FOREWORD


Both Evolutionary Plant Breeding (EPB) and Participatory Plant 
Breeding (PPB) have a long history. The original idea of EPB traces 
back to 1929, but even earlier, in 1908, Herbert J. Webber, a highly 
respected professor at Cornell University, wrote “Plant-Breeding for 
Farmers”. On the first page he writes: “No farmer is so poor that he can- 
not afford a small plot for the improvement of corn, wheat or potatoes. 
In fact, it can be said that no farmer can afford not to have such a plot 
to produce its own seed of different crops” (Webber 1908). Ten years 
later, in 1917, Henry A. Wallace encouraged farmers to experiment with 
crossing varieties of corn (Wallace 1917) and thought that the only way 
for breeders to discover new strains of corn was to rely on the exper- 
tise of the knowledgeable corn farmers themselves. There are other ex- 
amples of this interest in collaboration, which interestingly ended, for 
example in the USA and Italy, in conjunction with the introduction of 
hybrid corn (Fitzgerald 1993). 

For several years research was conducted on the role of populations 
and mixtures. PPB almost disappeared until it was proposed as partici- 
patory research by Rhoades and Booth (1982). 

In this manual we will first describe PPB and its scientific basis. We 
will then introduce the scientific basis of EPB, and illustrate EPB as a 
methodology that benefits from, but does not necessarily require, the 
participation of an institution. 

The manual presents the development, achievements, and advantages
of EPB and details the several ways in which farmers can
manage evolutionary populations (EPs) and produce their own seed. 

Finally, we will show the relationship between EPB and PPB and
how EPs can generate useful material for implementing a PPB program.

The manual includes some elementary concepts of experimental
statistics, which are not intended to be exhaustive as there are several
books and manuals that readers can consult. These concepts have
been added to help understand the experiments on which the science 
of EPB is based. 

This manual is intended for plant breeders, but also for technicians, 
extension agents, and practitioners with a basic knowledge of genetics. 


DEDICATION 


This manual is dedicated to Prof. Martin Wolfe who passed away on 
March 10, 2019 at the age of 81. 

Martin worked as a plant pathologist in the Plant Breeding Institute 
of Cambridge from 1960 until 1988, when the institute closed. He then 
occupied the chair of plant pathology at the Swiss Federal Institute of 
Technology in Zurich until 1997. 

Starting in 1994, he developed Wakelyns Agroforestry in Suffolk, 
which represents one of the first research centers on agroforestry in the 
United Kingdom and is the center where Martin did pioneering work on 
the development of cereal populations. In 1998, he began contributing 
to the development of the research program at the Organic Research 
Centre before becoming its main scientific consultant and participating 
in some projects. 

In 2017, he became Professor of Genetic Improvement for Sustaina- 
ble and Resilient Agriculture at the University of Coventry. 

Martin devoted much of his scientific career to the study and promo- 
tion of agrobiodiversity. 

As early as the mid-1980s, Prof. Martin Wolfe was a highly critical voice
on the approach used in genetic improvement for disease resistance
based on the use of individual genes for resistance. Faced with the 
objection that the problem could be overcome by combining different 
genes for resistance in a single plant, he replied, “you are creating the 
ideal conditions for a disaster because the pathogen will adapt very 
quickly to the combined resistance of the host”. 

Prof. Martin Wolfe was among the first to explore the advantage of 
mixtures in what was then Eastern Europe by using malting barley mixtures
that quickly spread over more than 10,000 hectares in Poland, and
particularly in East Germany, where they were introduced in 1984. They
were so successful because the mixtures had been formulated specifically
for powdery mildew resistance and malting quality. At the end of the 1980s,
the mixtures covered almost all the area cultivated with malting spring 
barley (over 300,000 hectares) with a reduction from 50% to 10% in the 
incidence of the disease, while the use of fungicides was reduced to just 
one treatment on over 100,000 hectares. There was no decrease in pro- 
duction while the quality of the malt was considered satisfactory. Sadly, 
with the unification of the two Germanies, the project was abandoned 
due to the preference of Western European malters for malt obtained 
from individual varieties even if treated with fungicides! 

Even if that project was abandoned, the vision of Prof. Martin Wolfe 
is more alive than ever at a time when, like never before, the idea of re- 
turning to cultivating diversity has returned to the forefront of research. 
The wheat populations which he helped to develop and which he stud- 
ied over many years are being grown in several countries in Europe and 
continue to generate insights into the potentials of plant diversity. 

Martin’s enduring and endearing enthusiasm lives on in the
many farmers and scientists he was able to convince of the need for 
diversification. 


1. 
INTRODUCTION 


The global issues most frequently debated today in international re- 
ports (IPES-FOOD 2016; Development Initiatives 2017; FAO 2016; 
FAO, IFAD, UNICEF, WFP, and WHO 2021) and reviews (WHO/CBD 
2015) are climate change, poverty, malnutrition (which includes un- 
dernutrition and obesity), water scarcity, and the loss of biodiversity 
in general and of agrobiodiversity in particular. These issues are often 
covered separately even though they are closely interconnected, and we 
argue that they are interconnected through seed. 

Seed is related to climate change because we need crops better adapt- 
ed to the climate as it changes. 

The concept of “planetary boundaries” was proposed in 2009 to 
define a “safe operating space for humanity” (Rockström et al. 2009).
The boundaries include climate change, rate of biodiversity loss, 
ozone depletion, acidification of the oceans, human interference with 
nitrogen and phosphorus cycle, global freshwater use, change in land 
use, chemical pollution, and atmospheric aerosol loading. Four of the 
nine boundaries, namely climate change, rate of biodiversity loss, hu- 
man interference with nitrogen and phosphorus cycle (Steffen et al. 
2015), and global freshwater use (Jaramillo and Destouni 2015), have
already been crossed.

The four processes affect agricultural productivity, and in fact there 
has already been a decline in crop resilience as recently shown in the 
case of wheat in Europe (Kahiluoto et al. 2019). However, the argument 
of resilience is still debated, as Piepho (2019) disputed the previous 
claim and a new methodology to estimate resilience has been recently 
proposed (Zampieri et al. 2020). 

Climate change is a complex breeding objective (Ceccarelli and 
Grando 2020a) because 1) changes in temperature and rainfall are 
likely to vary from location to location (Altieri et al. 2015) and are 
still largely uncertain, with efforts being made to quantify the uncertainty
(Weaver and Zwiers 2000); 2) climate change is not only about
temperature and rainfall, but it affects the distribution and outbreak 
of pests (Altieri et al. 2015; Heeb et al. 2019), in particular the spec- 
trum of insects (Zavala et al. 2008; Deutsch et al. 2018) including 
pollinators such as bumblebees (Kerr et al. 2015), diseases (New- 
ton et al. 2011; Pautasso et al. 2012; Juroszek et al. 2020; Miedaner 
and Juroszek 2021) and weeds (Ziska and Dukes 2010; Colautti and 
Barrett 2013; Matzrafi et al. 2016); 3) extreme weather events can 
influence the interactions between crops and pests in an unpredicta- 
ble way (Rosenzweig et al. 2001). The expansion of the geographical 
ranges of several insects, weeds, and pathogens has been documented 
in the USA (Rosenzweig et al. 2000). Tropical storms are additional 
events, which may contribute to the spreading of diseases (Campbell 
and Madden 1990; Lehmann et al. 2020). Finally, the fact that climate
impact often exceeds 10% of the rate of yield change indicates
that climate change is already exerting a considerable drag on yield
growth, which will affect food prices as well (Lobell et al. 2011).

All this evidence points at climate change as an extremely complex 
and evolving problem (Altieri et al. 2015), which requires an evolving 
solution. 

Seed is associated with food as most of our food comes directly or 
indirectly from plants, and, through food and child nutrition, is related 
to poverty (Save the Children 2012). Seed is related to water, because 
about 70% of water is used to irrigate crops
(http://www.fao.org/nr/water/aquastat/infographics/Withdrawal_eng.pdf), and seed of crops and
varieties able to produce an economic yield with less water will make 
more water available for other human uses. Seed is also associated with 
malnutrition; the three crops from which we derive about 60% of the 
calories and 56% of the proteins from plants — namely maize, wheat, 
and rice (Thrupp 2000; FAO 2013) — are far less nutritious than crops 
such as barley (Grando and Gomez Macpherson 2005), millets, and 
sorghum (Dwivedi et al. 2011; Boncompagni et al. 2018). Millets and 
sorghum need less water than maize, rice, and wheat, which use nearly 
50% of all the water used for irrigation of crops. 

Finally, seed is related to biodiversity in general and to agrobiodiver- 
sity in particular. At the farm level, agrobiodiversity can be in the form 
of growing different crops, different varieties within the same crop, and 
heterogeneous (genetically non-uniform) material. Out of the more than
6,000 plant species that have been cultivated for food (IPK 2017), few- 
er than 200 species have significant production levels globally, with 
only nine (sugar cane, maize, rice, wheat, potatoes, soybeans, oil-palm 
fruit, sugar beet, and cassava) accounting for over 66 percent of all crop 
production by weight (FAO 2019). 

Agrobiodiversity is important for food security (Cardinale et al. 2012; 
Hooper et al. 2012; Zimmerer and de Haan 2017), for increasing farm 
income, for generating employment, and for reducing the risk of yield 
losses (Di Falco and Chavas 2006; Pellegrini and Tasciotti 2014; Renard
and Tilman 2019). Biodiversity makes production systems more
resilient (FAO 2019) and is an essential resource for crop improvement 
to adapt agriculture to a changing climate and consumer preferences 
(Hufford et al. 2019; van Frank et al. 2020). 

Agrobiodiversity is also important in relation to ecosystem services 
(Jackson et al. 2007), which represent the benefits human populations 
derive, directly or indirectly, from ecosystem functions. For simplicity, 
we will refer to ecosystem goods and services together as ecosystem 
services. Ecosystem services include gas regulation, climate regulation, 
disturbance regulation, water regulation, water supply, erosion control 
and sediment retention, soil formation, nutrient cycling, waste treat- 
ment, pollination, biological control, refugia, food production, raw ma- 
terials, genetic resources, recreation, and cultural services (Costanza et 
al. 1997). A literature review compared biologically diversified farming 
systems with conventional farming systems and found that diversified 
systems support substantially greater biodiversity, soil quality, car- 
bon sequestration, water-holding capacity in surface soils, energy-use 
efficiency, and resistance and resilience to climate change. In other 
words, biodiversity is beneficial in terms of several ecosystem services 
(Kremen and Miles 2012). 

The importance of biodiversity is confirmed by a recent paper, which 
reviewed 5,160 original studies comprising 41,946 comparisons be- 
tween diversified and simplified practices. It showed that, overall, di- 
versification enhances biodiversity, pollination, pest control, nutrient 
cycling, soil fertility, and water regulation without compromising crop 
yields (Tamburini et al. 2020). 

However, plant breeding, the science responsible for producing new 
varieties in the last 70 years, has gone towards uniformity (Frison et 
al. 2011; Brumlop et al. 2019). As Frankel wrote, “from the early days
of plant breeding, uniformity has been sought after with great determi- 
nation. For this there are many reasons — technical, commercial, his- 
torical, psychological, aesthetic” (1950). He added that the concept of 
purity “has not only been carried to unnecessary length but... may be 
inimical to the attainment of highest production” since it is “concerned 
with characters which are readily seen but often of little significance”. 
Note that Frankel never used terms such as “scientific” or “biological”. 

However, Frankel went largely unheard as today most modern varie- 
ties are pure lines, hybrids, or clones depending on the crop and on the 
market demand. Pure lines are reproduced through inbreeding, clones 
through asexual reproduction or cloning. In both cases, their offspring 
are identical to the parents. Hybrids are crosses between pure lines or 
clones. All those processes dramatically reduce genetic diversity. How- 
ever, the uniformity of varieties alone would not be sufficient to ex- 
plain the decline of agrobiodiversity because plant breeding could have 
produced many of them. The decline in agrobiodiversity is due to the 
predominant philosophy being followed by breeders (both public and 
private) with regard to two fundamental concepts, namely adaptation,
defined as the ability of a variety to perform well in a given location, 
and stability, defined as yield consistency across years (Barah et al. 
1981; Lin and Binns 1988; Evans 1993). 

Most plant breeders claim that, from a practical point of view, breed- 
ing should aim at developing varieties with both a stable and high yield 
and a wide adaptation (see for example, De Vita et al. 2010). While a 
variety with a high stability of high yield in a given location is obvi- 
ously what every farmer would like to have, particularly in view of the 
large year-to-year weather fluctuations (Baethgen 2010), wide adapta- 
tion, namely stability across locations, obviously leads to a decrease in 
agrobiodiversity and is only in the interest of seed producers. 

Though adaptation (specific vs wide) is one of the most widely de- 
bated concepts in plant breeding, adaptation by natural selection is the 
fundamental principle of the theory of evolution. Furthermore, an early 
step in the process of adaptation and diversification is the differential 
adaptation of populations of the same species (Hereford 2009), hence 
an increased within-species diversity. 

This decline in agrobiodiversity has increased the vulnerability of 
crops (Esquinas-Alcazar 2005; Hajjar and Hodgkin 2007; Keneni et al. 
2012; Fisher et al. 2018) because their genetic uniformity makes them 
unable to respond to both short- and long-term climate changes. A re- 
cent study shows that, globally, climate variability accounts for roughly 
a third (between 32 and 39%) of the observed yield variability. In the
case of wheat, a crop for which we grow predominantly uniform vari- 
eties, climatic variability explains 31 to 51% of the variability in yield 
in Western Europe and 23 to 66% of the variability in yield in Eastern 
Europe, while in Southern Europe climatic variability is responsible 
for 15 to 45% of the yield variability in Italy and Greece and more than 
75% in Southern Spain (Ray et al. 2015). 

In addition, uniform crops provide an ideal breeding ground for the 
rapid emergence of fungicide-resistant variants (Fisher et al. 2018). An 
extreme example of the consequences of low diversity is the potato late 
blight epidemic and ensuing famine in 19th century Ireland (Machi- 
da-Hirano 2015). 

An example of the importance of agrobiodiversity comes from the 
work of Tilman and Downing (1994) who, in a long-term study of grasslands,
found that more diverse plant communities are more drought resistant
and recover faster from drought. This led them to claim that
biodiversity generates stability. A re-analysis of their data revealed that 
this stabilizing effect was the result of the statistical averaging between 
species that did not change synchronously over time. 

In a paper published in 1952, Harry Markowitz, using simple analyt- 
ical and graphical models, developed the “portfolio” theory, a mathe- 
matical theory that predicts the conditions for which the mean (or the 
sum) of random and independent variables would be progressively 
more stable as more variables are averaged or summed. The theory was,
and still is, used to validate the value of diversification as a financial
investment strategy to reduce risk.

The results of Markowitz’s analytical and graphical models are well
known to financial advisors, who advise risk-averse customers to hold a
diversified portfolio, which according to portfolio theory tends to produce
more stable returns than less diversified portfolios, despite the variability
of the individual components. In fact, a well-diversified portfolio is more
likely to contain individual products that are independent or inversely related.
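
The averaging effect behind portfolio theory can be checked numerically.
The sketch below is only an illustration under simplifying assumptions
that are not in the original text: all components are independent and
share the same standard deviation, and the function names are
hypothetical. Under these assumptions, the mean of n components has a
standard deviation equal to the individual standard deviation divided
by the square root of n.

```python
import math
import random
import statistics


def sd_of_mean(n_components, sd=1.0):
    """Theoretical standard deviation of the mean of n independent
    variables that all share the same standard deviation `sd`."""
    return sd / math.sqrt(n_components)


def simulated_sd_of_mean(n_components, sd=1.0, n_years=20000, seed=1):
    """Empirical check: draw independent normal values (returns or yields)
    for each component and year, average them within each year, and
    measure how much that average varies from year to year."""
    rng = random.Random(seed)
    yearly_means = [
        statistics.fmean(rng.gauss(0.0, sd) for _ in range(n_components))
        for _ in range(n_years)
    ]
    return statistics.pstdev(yearly_means)


if __name__ == "__main__":
    # Compare theory and simulation for portfolios of increasing size
    for n in (1, 4, 16):
        print(n, round(sd_of_mean(n), 3), round(simulated_sd_of_mean(n), 3))
```

If the components are positively correlated, the reduction in variability
is smaller than this sketch suggests; if they are inversely related, it is
larger.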

The diversity-stability hypothesis (i.e., greater diversity corresponds 
to greater stability) was recently tested by Renard and Tilman (2019) 
who analysed the production of 176 crops in 91 countries from 1961 to 
2010 (50 years). The stability of national yields was defined as S = μ/σ,
in which μ is the mean national yield in kcal ha⁻¹ for a period of years,
while σ is the year-to-year (temporal) standard deviation for the same
period (the section on Experimental Design and Statistical Analysis explains
the meaning of these parameters and how they are calculated).
The data set was subdivided into 5 periods of ten years each, and μ and σ
were calculated for each period.
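
As a rough sketch of this calculation (not the authors' code; the function
names and the toy yield series are assumptions made for illustration),
stability for one period is the mean yield divided by the standard
deviation of yield across the years of that period. Whether the sample or
the population standard deviation was used is not stated here; the sketch
uses the sample form.

```python
import statistics


def yield_stability(yields):
    """Stability S = mean / standard deviation of a series of annual yields."""
    mu = statistics.fmean(yields)
    sigma = statistics.stdev(yields)  # sample standard deviation (assumption)
    return mu / sigma


def stability_by_period(yields, period_length=10):
    """Split a yearly series into consecutive periods and compute S for each."""
    return [
        yield_stability(yields[start:start + period_length])
        for start in range(0, len(yields) - period_length + 1, period_length)
    ]


if __name__ == "__main__":
    # Hypothetical national yields for 20 years (arbitrary units)
    steady = [100, 102, 98, 101, 99, 100, 103, 97, 100, 100]
    erratic = [100, 130, 70, 120, 80, 100, 140, 60, 110, 90]
    print(stability_by_period(steady + erratic))
```

A higher S means that year-to-year fluctuations are small relative to the
mean yield.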

Diversity was calculated as Shannon information index (Hill 1973) 
both as crop group diversity (the groups were cereals, vegetables, fruits, 
pulses, oil crops, sugar crops, stimulants, spices) and as individual spe- 
cies diversity. 
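
For readers unfamiliar with the index, a minimal sketch follows. It assumes
that diversity is computed from the proportion p of each crop (or crop
group) in the total, as H = -sum(p ln p). The quantity on which the
proportions were based in the study is not specified in this manual, so
the input below (harvested areas) is hypothetical, as are the function
names. The exponential of H can be read as an "effective number" of
equally abundant crops.

```python
import math


def shannon_index(amounts):
    """Shannon index H = -sum(p * ln p) over categories with non-zero share."""
    total = sum(amounts)
    if total <= 0:
        raise ValueError("amounts must sum to a positive value")
    proportions = [a / total for a in amounts if a > 0]
    return -sum(p * math.log(p) for p in proportions)


def effective_number(amounts):
    """exp(H): the 'effective' number of equally abundant categories."""
    return math.exp(shannon_index(amounts))


if __name__ == "__main__":
    # Hypothetical harvested areas for four crops in two countries
    even = [25, 25, 25, 25]
    dominated = [85, 5, 5, 5]
    print(shannon_index(even), effective_number(even))
    print(shannon_index(dominated), effective_number(dominated))
```

The index is highest when all crops have equal shares and falls towards
zero as a single crop comes to dominate.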

Their data clearly indicate that greater crop diversity at the national
level corresponds to greater stability of crop production over time
(highly significant and positive regression coefficient; figure 1).
They also show the destabilizing effect of rainfall and temperature
variability on crop production (highly significant and negative
regression coefficients).

Figure 1. Regression coefficients (± se) of different variables in a multiple regression on log
(national yield stability); a. Crop Group Diversity; b. Crop Species Diversity (redrawn from
Renard and Tilman 2019).


This greater stability is reflected in marked differences in the frequency
of years with sharp harvest losses. On average, a country with a mean
stability of 5 experiences a national yield decline of more than 25% once
every 8 years (figure 2), whereas countries with mean stabilities of 7.5, 10,
or 15 experience one only once every 21, 54, or 123 years, respectively.

Figure 2. National yield stability and probabilities of crop harvest losses (redrawn from Renard
and Tilman 2019). 


Crop diversity has been shown to be highly beneficial particularly 
in restricting the development of diseases (Zhu et al. 2000; Phillips 
and Wolfe 2005; Döring et al. 2011; McDonald 2014; McDonald and
Stukenbrock 2016) as well as their evolution (Palumbi 2001). For example,
in China, the use of variety mixtures led to a 94% reduction of
blast and an 89% increase in yields compared to monoculture, and made it
possible to avoid fungicide treatments within two years (Zhu et al. 2000). One of the most
notable examples of the advantages of mixtures is the spreading of bar- 
ley mixtures in the former German Democratic Republic up to 360,000 
hectares with a reduction of the percentage of fields affected by severe 
mildew epidemics from 50 to 10% and a threefold reduction of the per- 
centage of fields sprayed with fungicides (Wolfe et al. 1992). 

The relationship between diversity and stability has been at the center
of the debate among ecologists (McCann 2000). Before the 1970s, ecol- 
ogists believed that more diverse communities enhanced ecosystem sta- 
bility. These early intuitive ideas were challenged by the work of Rob- 
ert May (1973) who, using linear stability analysis, found that diversity 
tends to destabilize community dynamics. However, the most recent 
literature indicates that diversity can be expected, on average, to give 
rise to ecosystem stability. What is of interest in relation to evolutionary
populations is that diversity is not the driver of this relationship; rather,
ecosystem stability depends on the ability of communities to contain
species, or functional groups, that are capable of differential response
(McCann 2000).

The decline of biodiversity has been associated with the increase of 
inflammatory diseases in humans, ranging from inflammatory bowel 
disease to ulcerative colitis, cardiovascular disorders, various liver dis- 
eases and many types of cancer (von Hertzen et al. 2011). In turn, the 
increase in the frequency of inflammatory diseases has been associated 
with a decreased efficiency of our immune defences (von Hertzen et al. 
2011). Recently, the association of the microbiota (namely the
complex of bacteria, viruses, fungi, yeasts, and protozoa that is in our
intestine) with our immune system and with the likelihood of contracting
inflammatory diseases has been confirmed (Khamsi 2015).

Diet strongly influences the microbiota: a change in diet changes its 
composition in just 24 hours. It takes 48 hours, after changing the diet 
back again, before the microbiota returns to its initial conditions (Singh 
et al. 2017). 

Given the important roles of the microbiota on the one hand, and 
the fact it is so strongly and rapidly influenced by diet on the other, it 
is understandable that there have been many studies on the effect of 
various diets (Western, omnivorous, Mediterranean, vegetarian, vegan, 
etc.) on its composition and diversity (Singh et al. 2017). Recent results 
demonstrate that diet diversity is of paramount importance for having 
a healthy microbiota (Heiman and Greenway 2016; Ceccarelli 2019). 
But how can we have a diversified diet if the agriculture that produces
our food is based on uniformity? 

More recently, the most extreme expressions of biodiversity loss, 
namely deforestation and species extinction, have been linked to an
increased risk of pandemics such as COVID-19 (Tollefson 2020). 

The decline of biodiversity can be largely attributed to the change
of plant breeding from the way it was done by farmers for millennia to 
the way it is done by scientific institutions and corporations. In fact, a 
peculiar characteristic of farmers’ breeding was that selection was typ- 
ically for specific adaptation, and it would not have been possible oth- 
erwise. To be more precise, it was for specific adaptation in space and 
wide adaptation in time as innovations, such as new seed, were tested 
for a number of years before being adopted. In addition, in the absence 
of seed companies, farmers were procuring the seed for the following 
cropping season by collecting the seed from selected plants (presum- 
ably those that had desirable characteristics). Therefore, by necessity 
they had to mix the seed of the selected plants to obtain enough seed. By 
doing so, they maintained diversity within their fields, generating what 
today we call traditional varieties, farmers’ varieties, heirloom varieties, 
or landraces. Moreover, as each farmer selected for specific adaptation 
to her/his conditions and uses, this also generated diversity between 
fields. This is fundamentally different from formal plant breeding. 


2.
PLANT BREEDING PROGRAMS 


Formal or institutional plant breeding programs, which began to be
developed at the beginning of the 20th century, are characterized by
three main stages (figure 3) (Schnell 1982), namely:

1) generating genetic variability, that includes selection of parents, 
making crosses, crossing techniques, choice of type and number of 
crosses, induced mutation, introduction of germplasm from germplasm 
banks or from other breeding programs, or from farmers; 

2) selection of the best genetic material within the genetic varia- 
bility created or acquired in stage one. This stage lasts a number of 
cropping seasons (5 to 10) depending on the crop and on the meth- 
odologies used; 

3) testing of breeding lines, that includes comparisons between exist- 
ing cultivars and the breeding lines emerging from stage 2 using the ap- 
propriate experimental designs and statistical analysis to conduct such 
comparisons. This stage lasts at least 3 cropping seasons. 

The most common way of generating genetic variability in stage 
1 is to make crosses among parental lines selected for specific traits. 
The number of crosses generated at the beginning of each cycle can 
vary from a few hundred in small breeding programs to several thousand
in large international breeding programs. Use of accessions from
germplasm banks and new germplasm collections are other ways of 
bringing genetic variability in the program. 

During stages 2 and 3, genetic variability is gradually reduced, and 
promising breeding lines are identified. While the number of breeding 
lines decreases, the quantity of available seed per line increases along 
with the number of locations in which the material can be tested (Cec- 
carelli 2009). If we consider that a new cycle starts every year (or twice 
a year), the amount of material a breeding program handles on a yearly 
basis is large. 

A stage by itself is not the process: for example, a farmers’ evaluation
of a germplasm collection is just “a farmers’ evaluation of a germplasm
collection” unless it is a component of the entire process aimed at identifying
useful diversity and/or parental material.

Molecular tools such as marker assisted selection and genomic selec- 
tion are designed to improve the efficiency in identifying useful genetic 
combinations in stage 2. However, as we will see later, they do not nec- 
essarily translate into a higher breeding efficiency, namely into a higher 
efficiency of the entire process. 

Figure 3. Main stages of a breeding program.


In its most common implementation, the key stages of a formal plant 
breeding program, namely the creation of variability and particularly 
the selection stage, take place in one or a few research stations (centralized
selection) under optimum or near-optimum agronomic conditions,
including use of fertilizers and pesticides. In some cases, breeding pro- 
grams on research stations are managed to simulate specific stress con- 
ditions (Venuprasad et al. 2007). 

A typical centralized plant breeding program suffers from two main
problems. Firstly, centralization makes it difficult to address issues such 
as the client profile (breeding for whom), the product profile (breed- 
ing which type of variety), and the environment profile (breeding for 
where), which vary, even considerably, from location to location and 
within the same location, with time (Ceccarelli and Grando 2022). Sec- 
ondly, there is an issue of breeding efficiency because of the negative 
effect of genotype × environment interaction on genetic gains (Cobb
et al. 2019), an issue we will deal with later. The main consequence of 
these two problems is that varieties developed are specifically adapted 
to environments like the research station or made similar to that envi- 
ronment by using the same agronomic management including mechani- 
zation, fertilizers, irrigation, and chemical plant protection applications. 
It is therefore inevitable that selection is for wide spatial adaptation 
with the objective of identifying one or a few superior varieties which
can grow successfully in several locations and for which centralized
seed production becomes profitable. We have argued that breeders’ 
wide adaptation is often wide in a geographical rather than environ- 
mental sense (Ceccarelli 1989). Therefore, it is not surprising that cen- 
tralized breeding has caused a generalized decrease of agrobiodiversity, 
even though the reduction of diversity associated with plant breeding 
is somewhat controversial: for example, Landjeva et al. (2006) found 
that genetic diversity did not decline in Bulgarian winter wheat while 
Bonnin et al. (2014), using an integrative indicator of genetic diversity 
developed by Bonneuil et al. (2012), found a decline in the genetic diversity
of wheat during the 20th century. Reiss and Drinkwater (2018)
reached similar conclusions. More recently, Gatto et al. (2021) used a 
large dataset on the varietal release dynamics for 11 major food crops 
in 44 countries of Asia and Africa and found an increasing reduction of 
crop varietal diversity linked to the spatial displacement of traditional 
landraces (Dwivedi et al. 2016). 

Decentralized selection, defined as the deployment of the genetic varia- 
bility generated in the first stage of a breeding program in the target popu- 
lation of environments where selection is eventually done, has the potential 
of solving these problems. However, when it is used in the form of
multi-environment trials (MET) within a conventional breeding program, it fails
to do so, because only the best varieties across the range of environments 
are selected, rather than the best varieties in each environment. 
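
The difference between the two selection strategies can be illustrated
with a toy example. The yields, variety names, and environment names below
are invented for illustration, and the functions are hypothetical helpers,
not part of any breeding software. When varieties change rank between
environments, the variety with the best mean across environments need not
be the best variety in any single environment.

```python
from statistics import fmean

# Hypothetical yields (t/ha): one entry per variety, one value per environment
YIELDS = {
    "A": {"dry": 1.8, "medium": 3.0, "wet": 3.9},
    "B": {"dry": 2.5, "medium": 2.6, "wet": 2.8},
    "C": {"dry": 1.0, "medium": 3.1, "wet": 5.0},
    "D": {"dry": 2.2, "medium": 3.05, "wet": 4.0},
}


def best_across_environments(yields):
    """Wide adaptation: pick the variety with the highest mean yield."""
    return max(yields, key=lambda v: fmean(yields[v].values()))


def best_in_each_environment(yields):
    """Specific adaptation: pick the top variety separately per environment."""
    environments = next(iter(yields.values())).keys()
    return {
        env: max(yields, key=lambda v: yields[v][env])
        for env in environments
    }


if __name__ == "__main__":
    print(best_across_environments(YIELDS))
    print(best_in_each_environment(YIELDS))
```

With these invented numbers, selection on the overall mean retains only
variety D, which is not the top variety in any environment, whereas
selection within each environment retains B (dry) and C (medium and wet).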

A breeding program based on decentralized selection becomes
very effective when combined with the collaboration of the stake- 
holders, and even more so if the stakeholders’ collaboration begins 
with the identification of priorities and objectives. When this is 
done, a breeding program becomes a participatory plant breeding 
(PPB) program. A decentralized participatory breeding program is 
particularly suited to serve organic agriculture (Ceccarelli and Gran- 
do 2020b; Colley et al. 2021). 

Therefore, PPB has the power to reverse the decrease of agrobiodi- 
versity caused by centralized breeding by replacing centralized selec- 
tion with decentralized selection and selection for wide adaptation with 
selection for specific spatial adaptation done jointly by farmers and scientists.
Ultimately, these two characteristics of PPB are also the reasons
for its lack of institutionalization.


3.
PARTICIPATORY PLANT BREEDING 


A breeding program becomes participatory when farmers (we focus
on farmers as they are the most common clients of plant breeding
programs, although these programs often address other types of clients as well)
share with the scientists all the most important decisions during all 
the stages shown in figure 3. Depending on when the participation 
starts, a distinction has been made between PPB and participatory 
variety selection (PVS). The latter term is used when farmers’ par- 
ticipation begins in stage 3, at testing of breeding lines. While, on 
one side, PVS is technically easier to organize because farmers are 
only involved in expressing their opinion on the limited number of 
lines that usually reach that stage (Ceccarelli et al. 2000), on the other 
side it leaves to them a very limited number of choices to make. Fur- 
thermore, with PVS there is a risk for breeding material potentially 
desirable to farmers to be discarded before they even see it. However, 
because it is simple to organize, PVS can be a valuable entry point to 
start experimenting with the participation of farmers assuming that 
PVS is fully decentralized. 

To integrate customer profile and product profile, stages such as social
targeting, demand analysis (Weltzien and Christinck 2009), and seed
production and distribution of cultivars (Bishaw and van Gastel 2009) have
been added to the breeding cycle, modifying figure 3 into figure 4. 

Figure 4. Main stages of a breeding program revisited (modified from Tufan et al. 2018). 


4. 
WHY FARMERS’ PARTICIPATION 
IS STILL MARGINAL 


Over the years, PPB has accumulated success stories and received several
recognitions, some of which are presented in Ceccarelli and Grando (2019).
However, farmers’ participation is still marginal; here we discuss whether
one reason could be the perception that PPB has a weak scientific basis. 

The origin of PPB, in its widest connotation, traces back to the pa- 
per by Rhoades and Booth (1982), which presented an “alternative 
approach to solving farm-level technological problems”. The research 
was conducted at the Centro Internacional de la Papa (CIP) based in 
Lima, one of the CGIAR centers, and supported by funds from CIP, the 
Rockefeller Foundation, and the International Development Research 
Centre, Canada. The paper emphasized the advantages of an interdis- 
ciplinary (research teams working together) over a multidisciplinary 
(teams fulfilling independent specific disciplinary roles and passing on 
information) approach. 

That publication, together with the one that followed (Rhoades et 
al. 1986), supported the principle that when developing a new agricul- 
tural technology, it is necessary to start with the farmers, who must be 
involved in the process, rather than ignoring them and eventually giv- 
ing them a “beautiful” and ready-to-use technology. When applied to 
plant breeding, this represented a reversal of the model defined as “del- 
egative” (from the French délégatif) by Bonneuil and Demeulenaere 
(2007) and Thomas et al. (2011), in which agricultural production, seed 
production, varietal innovation and conservation of genetic resources 
changed from being part of farmers’ activities to being functionally
separated and delegated to specialized scientists, while the farmers lost the 
responsibilities for innovation and conservation (figure 5). 

By the time the paper was published, the delegative model was already so
well established that the world described by Kloppenburg (2010) was nearly
forgotten: “they (the farmers) decided what seeds 
to plant, what seeds to save and who else might receive or be allocated 
their seed as either food or planting material. Such decisions were made 
within the overarching norms established by the cultures and communi- 
ties of which they were members”. 

There was a process of dispossession of both genetic material and knowledge,
with the establishment of a consolidated system of power, authority and
control; PPB implies changes to such a system and was therefore considered
very radical and perhaps even subversive (Crane 2014). 

Several nearly concomitant factors might help explain why the idea of 
participation was not received with enthusiasm by professional breeders 
(Belay 2009). Firstly, as indicated above, the proposal of participatory 
research came from social scientists who also conducted early PPB ex- 
perimentation with the institutions and practices of biophysical scienc- 
es often either left invisible or assumed to be purely technical (Crane 
2014). At about that time, conventional, centralized, non-participatory 
plant breeding was already well established, addressing an industrialized 
agri-business model (seed market, pesticide market, and food industry). 

Figure 5. From Farmers’ breeding (left) to Institutional/Corporate Breeding (right) (redrawn 
from African Centre for Biodiversity 2018). 

The development of a new vocabulary describing, for example, participation
in plant breeding as being conventional or contractual, consultative,
collaborative, collegial, or even considering farmer experimentation as a
type of participation (Sperling et al. 2001) gave an impression of PPB as a
static phenomenon, while field experience shows that in a truly
participatory program, as the farmers become more empowered, the process
itself evolves (Desclaux et al. 2012). But it is likely that the real reason
why PPB never “took off” is that it represents the reversal of the
dispossession process described earlier, by making possible a repossession
by farmers of the entire process, but mostly of seed production and exchange
(Ceccarelli and Grando 2019; 2020c). 

The possible reasons for the general institutional reluctance to use 
PPB as the strategy in their breeding programs have been discussed re- 
cently by Ceccarelli and Grando (2021; 2022). They range from conven- 
tional and biotechnological methods dominating university curricula on 
plant breeding to the reward system in public institutions, which is still 
largely based on the number of varieties released. A more fundamental 
reason is the reluctance to accept the paradigm shift that PPB inevitably 
implies in “seed sovereignty” and, consequently, “food sovereignty” 
(Ceccarelli and Grando 2019; 2020c). In several countries, it has been 
reported that institutional support was mostly of a personal nature and 
ended when the person left. However, despite the generalized reluc- 
tance, PPB is widely practiced globally (Ceccarelli and Grando 2020c; 
Colley et al. 2021), mostly by institutions, such as several universities, 
which do not have plant breeding as their institutional mandate. 

Therefore, if we want to reverse the trend towards uniformity and
monoculture, with all their consequences, including those on our health
(Ceccarelli 2019), and we want to shift from “cultivating uniformity” to
“cultivating diversity”, we need to use an approach which makes the
participation of institutions an option rather than a necessity, as it is in
PPB. 


5. 
THEORY OF SELECTION: FROM CENTRALIZED 
TO DECENTRALIZED SELECTION 


Before we discuss how to go beyond PPB, we need to start from what 
is known as the breeder’s equation: 


$$R = S\,h^2 \qquad (1)$$


This equation clarifies the scientific basis of the shift from cultivating
uniformity to cultivating diversity. 

In formula (1), the response to selection (R), or genetic gain, depends on
the selection differential (the difference between the mean of the selected
individuals and the mean of the whole population), indicated as S, and on
the broad sense heritability (or repeatability) of the target trait,
indicated as $h^2$ (Falconer 1981). The selection differential is determined
by the intensity of selection (i), which in turn depends on the percentage
of individuals selected and is equal to $S/\sigma_p$, where $\sigma_p$ is
the square root of the phenotypic variance. Therefore, the breeder’s
equation can also be expressed as


$$R = i\,\sigma_p\,h^2 \qquad (2)$$

It is worth remembering that

$$h^2 = \frac{\sigma^2_g}{\sigma^2_p} = \frac{\sigma^2_g}{\sigma^2_g + \sigma^2_{ge}/e + \sigma^2_e/(re)} \qquad (3)$$

where $\sigma^2_p$, $\sigma^2_e$, $\sigma^2_{ge}$ and $\sigma^2_g$ are the
phenotypic, environmental, genotype × environment interaction (GEI), and
genotypic variances, respectively, and r and e are the number of
replications and the number of environments (locations, years, or
location-year combinations), respectively. GEI plays a key role, as we will
see below. 
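
As a numerical illustration of equations (1) to (3), the following Python
sketch computes heritability from the variance components and the resulting
response to selection. The function names and all numerical values are
hypothetical, chosen only to show how the terms interact.

```python
import math


def heritability(var_g, var_ge, var_e, n_env, n_rep):
    """Equation (3): h2 = var_g / (var_g + var_ge/e + var_e/(r*e))."""
    var_p = var_g + var_ge / n_env + var_e / (n_rep * n_env)
    return var_g / var_p


def response_from_differential(S, h2):
    """Equation (1): R = S * h2."""
    return S * h2


def response_from_intensity(i, var_p, h2):
    """Equation (2): R = i * sigma_p * h2, with sigma_p = sqrt(var_p)."""
    return i * math.sqrt(var_p) * h2


# Hypothetical variance components (genotypic, GEI, environmental).
var_g, var_ge, var_e = 0.10, 0.20, 0.40

h2_single = heritability(var_g, var_ge, var_e, n_env=1, n_rep=2)
h2_multi = heritability(var_g, var_ge, var_e, n_env=4, n_rep=2)
print(f"h2 with 1 environment:  {h2_single:.2f}")
print(f"h2 with 4 environments: {h2_multi:.2f}")
print(f"R for S = 1.0:          {response_from_differential(1.0, h2_multi):.2f}")
```

In this invented case, testing in four environments instead of one raises
the heritability from 0.2 to 0.5, because the GEI variance is divided by e
and the environmental variance by re.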

The point to be made is that the breeder’s equation, expressed as 
(1) or (2) assumes that the selection environment and the target envi- 
ronment are the same, or, in other terms, it only estimates the “direct” 
response to selection or genetic gain. 

However, the most common case is that the selection environment 
(one or more research stations) is different from the target environment 
(the farmers’ fields in one or more countries), and therefore, the breed- 
er’s equation, no matter how expressed, is irrelevant because the re- 
search station cannot possibly represent the multitude of often diverse 
target environments, except in very specific situations such as when no 
or only trivial GEI exists. 

Therefore, it is much more relevant to calculate the genetic gain as the
correlated response to selection (CR) in the target environment(s), which is
equal to


$$CR_T = R_S\,\frac{h_T}{h_S}\,r_g \quad \text{or} \quad i\,h_S\,h_T\,r_g\,\sigma_{pT} \qquad (4)$$


where $R_S$ is the genetic gain in the selection environment (direct
response), $h_T^2$ is the heritability in the target environment, $h_S^2$ is
the heritability in the selection environment, $r_g$ is the genetic
correlation coefficient between the measures of the trait object of
selection in the two environments, and $\sigma_{pT}$ is the phenotypic
standard deviation of the trait in the target environment. 

By manipulating (4), the relationship between $CR_T$ and $R_T$, the direct
response to selection in the target environment, is given by


$$\frac{CR_T}{R_T} = r_g\,\frac{h_S}{h_T} \qquad (5)$$


which tells us that: 

a) When $h_S = h_T$, the maximum value of $CR_T/R_T$ is 1, reached when
$r_g = 1$; in other words, when heritabilities are the same, direct
selection will always be more effective than correlated response
($R_T > CR_T$) because a genetic correlation coefficient equal to 1 has a
very low probability; 

b) With low genetic correlations (0.1–0.2), which are often found between
high-yielding breeding nurseries and low-yielding target environments (Atlin
et al. 2001), $h_S$ must be at least 5–10 times higher than $h_T$ for $CR_T$
to be greater than $R_T$. A review showed that the ratios between $h_S$ and
$h_T$ reported in the literature (Ceccarelli 1996) indicate that generally
$R_T > CR_T$ (Ceccarelli 1994); 

c) Heritability alone is not sufficient to determine the optimum selection
environment because when $r_g$ is negative, as in the case of GEI of
crossover type, the ratio $h_S/h_T$ becomes irrelevant. 
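
Points a) to c) can be checked numerically with a short Python sketch. It
takes equation (5) in the form CR_T/R_T = r_g (h_S/h_T), where S denotes the
selection environment and T the target environment; this reading of the
subscripts, the function names and the example values are assumptions of the
sketch. The arguments h_S and h_T are the square roots of the heritabilities.

```python
def relative_efficiency(r_g, h_S, h_T):
    """Equation (5): CR_T / R_T = r_g * (h_S / h_T)."""
    return r_g * h_S / h_T


def min_ratio_for_indirect(r_g):
    """Smallest h_S/h_T at which CR_T reaches R_T.

    Returns None when r_g is zero or negative: no ratio of heritabilities
    can then make indirect selection as effective as direct selection.
    """
    if r_g <= 0:
        return None
    return 1.0 / r_g


# a) equal heritabilities: the ratio equals r_g and cannot exceed 1
print(relative_efficiency(1.0, 0.7, 0.7))

# b) low genetic correlations require a very large heritability ratio
for r_g in (0.1, 0.2):
    print(r_g, min_ratio_for_indirect(r_g))

# c) negative genetic correlation (crossover GEI)
print(min_ratio_for_indirect(-0.3))
```

With r_g = 0.1 the required ratio is 10 and with r_g = 0.2 it is 5, which is
the range of 5 to 10 times quoted in point b).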

Given that one distinctive feature of PPB is decentralized selection, 
the arguments above provide the theoretical framework to discuss the 
advantages and disadvantages of centralized versus decentralized se- 
lection: one argument has been that centralized selection, namely se- 
lection in the research station, maximizes heritability, while decentral- 
ized selection, namely selection in the target environment, maximizes 
correlation (Ceccarelli 1996). The weakness of the argument is that it 
assumes that heritability is intrinsically higher in the favourable and 
controlled conditions of a research station, an assumption that does not 
always hold true (Ceccarelli 1994; Dawson et al. 2008). Furthermore, 
low heritability poses a problem for genomic selection as well, because 
its accuracy depends on the size of the training population, the herita- 
bility and the effective number of loci (Bassi et al. 2017). 

When heritability is considered across the whole range of the target
population of environments (TPE), it can be manipulated by conducting Multi
Environment Trials (MET) (see the Experimental Design and Statistical
Analysis section for details) to partition GEI into Genotype × Locations
(GL) and Genotype × Years within Locations (GY/L). This allows a measure of
the repeatability of GL and hence a subdivision of the TPE into subgroups in
such a way that within each subgroup GEI is minimized, $h^2$ is maximized
and hence R is also maximized by selecting for specific adaptation between
subgroups and for wide adaptation within each subgroup. 

GEI has been at the center of the debate between advocates of “wide 
adaptation” and of “specific adaptation”. This is partly due to the con- 
fusion about E: in most literature, E could be locations (L), or years 
(Y) or, even worse, a combination of L and Y. Yet already 50 years 
ago, Allard and Hansche (1964) and Allard and Bradshaw (1964) specified
that GY and GL cannot be combined, because the former is largely
unpredictable while GL can be, to some extent, predictable. While 
decentralized selection can make a positive use of GL interactions by 
selecting for specific adaptation, varieties well buffered against unpre- 
dictable fluctuations of the environment are the solution to GY. This 
can be achieved through individual and population buffering. While in- 
dividual buffering is a property of specific genotypes, and particularly 
of heterozygotes, population buffering arises from the interactions among 
the different genotypes within a population, beyond the individual buff- 
ering of the specific genotypes. Therefore, the advantage of heterogene- 
ous populations is that they can exploit both individual and population 
buffering. 

The theory of selection offers the theoretical backup to decentralized 
selection. In fact, studies conducted in Australia (Pederson and Rathjen 
1981; Cooper et al. 1997) to evaluate the relevance of research stations 
for their suitability as selection environments have found that, in many 
cases, the genetic correlations between the yield of breeding lines on 
the research station and yield under on-farm conditions were low in 
comparison to the genetic correlations between different on-farm ex- 
periments. Therefore, while lower experimental errors and higher her- 
itability could usually be achieved on the research stations, the results 
were found to have limited relevance to genotype performance in the 
on-farm TPE. Consequently, there was an investment into capability for 
conducting large breeding trials under on-farm conditions at all stages 
of the breeding program to increase the chances of conducting MET 
that were accurately targeted to the farming systems (Banziger and 
Cooper 2001). 

Evolutionary biology offers additional support to decentralized selection
as the strategy to breed for specific, namely local, spatial adaptation, by
demonstrating that a) locally adapted plants have a higher 
performance (Leimu and Fischer 2008), b) locally adapted plants have 
a 45% higher fitness than introduced genotypes, and c) adaptation to 
one environment comes at a cost of adaptation to other environments 
(Hereford 2009). By contrast, few studies evaluated local adaptation 
(Kissing Kucek et al. 2021b). 


6. 
GENETIC GAINS AND BREEDING EFFICIENCY 


One major issue in research for development programs — and plant 
breeding is no exception — has been and still is the adoption by farmers’ 
communities and in general by clients, of the final products (Rhoades 
and Booth 1982; Rhoades et al. 1986). Low adoption rate has been and 
still is a major problem as recently confirmed by Alary et al. (2020) and 
Thiele et al. (2020); the CGIAR is no exception, with many varieties 
released but never adopted by farmers (Kholova et al. 2021) arguably 
as a consequence, at least partly, of GEI. 

Selection theory, as we discussed so far, does not and cannot predict 
whether the product obtained by maximizing genetic gains, even in the 
target environment, is what farmers need, and this is an additional limi- 
tation of genomic selection, which only increases genetic gains but not 
necessarily breeding efficiency (Bassi et al. 2017). 

The disconnection between breeding activities and seed production, 
and the inefficiencies of several formal seed supply systems have been 
two of the factors negatively affecting adoption (Bishaw and Turner 
2008). Breeders circumvented the problem, claiming that adoption does 
not belong to selection theory, and it is not a breeder’s problem, but a so- 
cio-economic issue. Therefore, in public breeding programs, “breeding 
efficiency” has been often measured as the number of varieties accepted 
for official release. This means applying the breeder’s equation to the va- 
riety release system, which is notoriously based on severely flawed trials 
(Tripp et al. 1997). Variety release, unrelated to adoption, still drives the 
professional careers of public breeders in several countries. 

The literature on adoption is large and shows that in a centralized, 
non-participatory breeding program, it is difficult to predict adoption 
(Ceccarelli 2015). Also, the real reasons behind adoption (or lack of it) 
are not always easy to understand as shown by the case of adoption of 
hybrid corn in the USA between 1933 and 1945, which has been attributed
not only to the superiority of the hybrids, but also to a succession 
of droughts that made corn seed difficult to find and to the policy of 
corn acreage reduction, which tempted farmers to try the more pro- 
ductive hybrids (Fitzgerald 1993). While private breeding companies 
can drive adoption by both strong advertising and seed market control 
— often adoption is simply a lack of choices — public plant breeding can 
increase adoption rate through full, non-discriminatory, gender inclu- 
sive, participation of farmer communities in the entire breeding pro- 
cess. Participation increases adoption because potential adopters take 
part in the selection work, if the participants’ representativeness of a 
wider community of clients is ensured. 

Centralized-participatory plant breeding, i.e., farmers taking part 
in selection work at a research station, is not a valid alternative be- 
cause at research stations farmers select different entries compared 
with their own fields (Ceccarelli et al. 2000; Ceccarelli et al. 2001; 
Reguieg et al. 2013). 


7. 
THE SCIENCE OF EVOLUTIONARY PLANT BREEDING 


7.1. Definitions 


In the following pages we will use several terms, including bulk
populations, composite crosses (CC), evolutionary populations (EP), and
mixtures. 

Bulk populations, CCs and EPs are synonyms (hereafter we will mainly use the
term EPs, except when citing papers); the fundamental difference between
these and mixtures is shown in figure 6. 

A mixture is made by mixing a given quantity of seed (in general, a given
number of seeds) of different varieties of the crop in question (figure 6,
left). An 
EP is made by mixing the seed obtained by crossing different varieties 
(figure 6, right). The ideal EP would be made by crossing in all possible 
combinations a certain number of varieties. The progenies of crosses 
can be further crossed, requiring more cycles of crossing as we will see 
in an example later (figure 23). The crossing can be done by farmers, 
but is usually more conveniently done by breeders, who make crosses 
routinely. In such a case, this is one of the participatory aspects of an 
evolutionary-participatory breeding program. 

The farmers can make mixtures and/or populations using the varieties 
they find on the market, or their own traditional varieties or a bit of both. 

In the case of mixtures, we need to distinguish between static and dynamic
mixtures. ‘Static mixtures’ are made up by mixing a given number of seeds of
each component at the beginning of each cropping season. They are static
because, although such physical seed mixtures are genetically more complex
than monocultures and can therefore be subjected to natural selection, they
do not capture the effects of natural selection occurring in the field. When
seed produced from mixtures is used as seed for the following crop, thus
capturing the effects of natural selection, we suggest using the term
‘dynamic mixtures’. In dynamic mixtures, over a few generations, even in
self-pollinated crops, low 
levels of out-crossing will occur leading to segregation and new gene 
combinations. When this occurs, the genetic structure of the crop would 
then move from being a ‘dynamic mixture’ to becoming a ‘population’ 
(Wolfe and Ceccarelli 2020). In the pages that follow we will use the 
term mixtures to indicate dynamic mixtures. 


Figure 6. The difference between a mixture and a population: a mixture is obtained by mixing 
seed of different varieties while a population is obtained by mixing the seed obtained by cross- 
ing different varieties. A mixture can be static or dynamic (see text). 


7.2. Research on mixtures and evolutionary populations 


The science of EPB is based on research initiated by Harlan and Mar- 
tini (1929), a barley agronomist and a barley botanist, respectively, at 
the Office of Cereal Crops and Diseases of USDA (United States De- 
partment of Agriculture), Washington DC. They used the term Compos- 
ite Hybrid Mixtures rather than ‘evolutionary populations’, which was 
used for the first time by Suneson nearly 30 years later (Suneson 1956). 
Another term used in some of the early papers was ‘bulk breeding’. 

Harlan and Martini (1929) proposed the composite cross (CC) method of plant
breeding and they synthesized a barley CC (called CCII) by pooling an equal
number of F, seeds obtained from 378 crosses among 
28 superior barley cultivars representing all the major barley growing 
areas of the world. 

The way they selected the 28 barley cultivars is of interest: during 
his association with the Office of Cereal Crops and Diseases of USDA, 
Harlan had grown about 5,000 varieties of barley coming from all over 
the world and from this large number he selected those that had shown 
some promise in the United States (table 1). 

The 378 crosses resulted from crossing the 28 cultivars in all possible 
combinations excluding reciprocals and selfing, namely 28(28-1)/2. 
Since the handling of 378 crosses was burdensome, it was planned 
(the paper described the plan) to mix an equal number of seeds of each 
cross and to distribute the population to whoever wished to make use of 
the material. 
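
The count of 378 can be verified by enumerating all pairs of parents, as in
the minimal Python sketch below. The parent labels P1 to P28 are placeholders
introduced for illustration, not the cultivar names.

```python
from itertools import combinations


def half_diallel(parents):
    """All pairwise crosses, excluding reciprocals and selfing."""
    return list(combinations(parents, 2))


parents = [f"P{k}" for k in range(1, 29)]  # placeholder labels for 28 cultivars
crosses = half_diallel(parents)
n = len(parents)

print(len(crosses))      # number of crosses obtained by enumeration
print(n * (n - 1) // 2)  # the same number from n(n-1)/2
```

The same formula, n(n-1)/2, gives 21 crosses for 7 parents.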

The CCII was used by several scientists in the years that followed, as 
we will show below. 

One important contribution of Harlan and Martini was the demon- 
stration that natural selection can modify the composition of plant pop- 
ulations grown under competitive conditions. 

Harlan and Martini (1938) analysed the composition of a mixture 
made with an equal number of seeds of 11 barley varieties (Trebi, Coast, 
Hannchen, Manchuria, White Smyrna, Smooth Awns, Lion, Meloy, 
Svanhals, Gatami and Deficiens). All these varieties can be easily dis- 
tinguished from each other except for the first two. The mixture was 
sent to 10 research stations in the USA: in each station the mixture was 
grown in a plot and at harvest, sufficient seed was saved to plant a plot 
the subsequent year and to send a sample to Washington. Therefore, 
this was a dynamic mixture. This sample was space-planted to allow 
counting the number of plants belonging to each variety in a random 
sample of 500 plants. If the mixtures had maintained their original
composition, a random sample of 500 plants should have contained about 45
plants per variety and about 90 for Trebi and Coast, which could not be
distinguished from one another. The total number of plants of each variety
surviving at the end of the experiment (table 2) shows that this was not the
case, and how rapidly a few varieties became dominant in specific locations,
such as Manchuria at Ithaca, White Smyrna at Moro and Hannchen at St. Paul,
and how rapidly poorly adapted types (such as Deficiens or Meloy) were
eliminated everywhere. 

Table 1. The 28 varieties selected by Harlan to produce the barley 
Composite Cross II (CCII) (redrawn from Harlan and Martini 1929). 


Name               | C.I. Nr | Origin             | Fertility | Awns   | Colour | Remarks
-------------------|---------|--------------------|-----------|--------|--------|--------
Horn               | 926     | North Europe       | 2         | Rough  | White  |
Hannchen           | 531     | North Europe       | 2         | Rough  | White  |
Wisconsin Winter   | 2159    | Southeast Europe   | 6         | Rough  | White  | Winter
Orel               | 351     | Russia             | 2         | Rough  | White  |
Arequipa           | 1256    | Northwest Africa   | 6         | Rough  | White  |
Algerian           | 1179    | Northwest Africa   | 6         | Rough  | White  |
Lion               | 923     | Northwest Africa   | 6         | Rough  | Black  |
Atlas              | 4118    | Northwest Africa   | 6         | Rough  | White  |
Sandrel            | 937     | Northwest Africa   | 6         | Rough  | White  |
Maison Carrée      | 3387    | Northwest Africa   | 6         | Rough  | White  |
Club Mariout       | 261     | Egypt              | 6         | Rough  | White  |
California Mariout | 3625    | Egypt              | 6         | Rough  | White  |
Good Delta         | 3801    | Egypt              | 6         | Rough  | White  |
Minia              | 3556    | Egypt              | 6         | Rough  | White  |
White Smyrna       | 910     | East Mediterranean | 2         | Rough  | White  |
Palmella Blue      | 3609    | East Mediterranean | 2         | Rough  | White  |
Trebi              | 936     | Armenia            | 6         | Rough  | White  |
Multan             | 3401    | India              | 6         | Rough  | White  |
Lyallpur           | 3403    | India              | 6         | Rough  | White  |
Everest            | 4105    | Mt. Everest        | 6         | Rough  | White  | Naked
Manchuria          | 2330    | Manchuria          | 6         | Rough  | White  |
Oderbrucker        | 4666    | Eurasian Plain     | 6         | Rough  | White  |
Han River          | 206     | China              | 6         | Rough  | White  |
Flynn              | 1311    | Hybrid             | 6         | Rough  | White  |
Glabron            | 4577    | Hybrid             | 6         | Smooth | White  |
Alpha              | 959     | Hybrid             | 2         | Rough  | White  |
Golden Pheasant    | 2488    | Hybrid             | 2         | Rough  | White  |
Meloy              | 1176    | Hybrid             | 6         | Hooded | White  |

Since the Coast-Trebi totals are the sum of the number of plants of 
two varieties, the figures are too large to be compared with those of the 
other varieties. However, the large numbers of this combination at so 
many places are partly due to the fact that Coast is well adapted to the 
West, while Trebi is well adapted to the region from Idaho east. 


Table 2. Final census showing the effect of natural selection in a mix- 
ture of barley varieties grown at 10 locations for 4 to 12 years, recorded 
as the number of plants of each of 11 varieties found in a population of 
500 plants (redrawn from Harlan and Martini 1938). 


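Number of plants of each variety in the year when last grown 

Variety         | Arlington, VA | Ithaca, NY | St. Paul, MN | Fargo, ND | North Platte, NE | Moccasin, MT | Aberdeen, ID | Pullman, WA | Moro, OR | Davis, CA
Last grown      | 1928          | 1936       | 1934         | 1930      | 1932             | 1936         | 1936         | 1930        | 1934     | 1928
----------------|---------------|------------|--------------|-----------|------------------|--------------|--------------|-------------|----------|----------
Coast and Trebi | 446           | 57         | 83           | 156       | 224              | 87           | 210          | 150         | 6        | 362
Gatami          | 13            | 9          | 15           | 20        | 7                | 58           | 10           | 1           | 0        | 1
Smooth Awns     | 6             | 52         | 14           | 23        | 12               | 25           | 0            | 5           | 1        | 0
Lion            | 11            | 3          | 27           | 14        | 13               | 37           | 2            | 3           | 0        | 8
Meloy           | 4             | 0          | 0            | 0         | 7                | 4            | 8            | 6           | 0        | 27
White Smyrna    | 4             | 0          | 4            | 17        | 194              | 241          | 157          | 276         | 489      | 65
Hannchen        | 4             | 34         | 305          | 52        | 13               | 19           | 90           | 30          | 4        | 34
Svanhals        | 11            | 2          | 50           | 80        | 26               | 8            | 18           | 23          | 0        | 2
Deficiens       | 0             | 0          | 0            | 1         | 3                | 0            | 2            | ?           | 0        | 1
Manchuria       | 1             | 343        | 2            | 37        | 1                | 21           | 3            | 1           | 0        | 0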
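
The expected composition mentioned in the text (about 45 plants per variety
and about 90 for Coast and Trebi together) can be compared with the observed
census using a minimal Python sketch. It uses the Ithaca, NY figures of
table 2; the ratio of observed to expected counts is a simple indicator
introduced here for illustration and is not part of the original analysis.

```python
SAMPLE_SIZE = 500
N_VARIETIES = 11

# Final census at Ithaca, NY (table 2); Coast and Trebi are counted together.
observed = {
    "Coast and Trebi": 57, "Gatami": 9, "Smooth Awns": 52, "Lion": 3,
    "Meloy": 0, "White Smyrna": 0, "Hannchen": 34, "Svanhals": 2,
    "Deficiens": 0, "Manchuria": 343,
}


def expected_counts(observed, sample_size, n_varieties):
    """Expected counts if every variety had kept its initial equal share."""
    per_variety = sample_size / n_varieties
    return {
        name: per_variety * (2 if name == "Coast and Trebi" else 1)
        for name in observed
    }


expected = expected_counts(observed, SAMPLE_SIZE, N_VARIETIES)
ratios = {name: observed[name] / expected[name] for name in observed}
dominant = max(ratios, key=ratios.get)

for name in observed:
    print(f"{name:16s} observed {observed[name]:3d}  expected {expected[name]:5.1f}")
print("Most over-represented variety:", dominant)
```

At Ithaca, Manchuria is several times more frequent than expected, while
three of the entries have disappeared from the sample.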


Suneson and Wiebe (1942), in experiments with barley mixtures, 
found that competition between varieties grown in mixtures caused re- 
sults which differ from those expected based on the performance of 
varieties in pure stand: in other words, a high yielding variety in pure 
stands can be a poor competitor in a mixture and vice versa. Similar 
results were obtained by Piano and Ceccarelli (1978). 

The results of research on mixtures and evolutionary populations will 
be described according to the trait (or the traits) studied. 


7.2.1. Evolutionary populations and plant height 


The competitive advantage of tall plants over short plants was re- 
ported by Bal et al. (1959) in barley, by Jennings and Herrera (1967) in 
rice using a cross between a tall and a dwarf variety, and by Khalifa and 
Qualset (1975) in wheat. 

As in previous studies, for example Hockett et al. (1983), during the 
evolution of the CCs there was a shift towards taller plants and late ma- 
turity. An increase in plant height associated with a decrease in kernel 
number per spike and kernel size, likely due to a reallocation of resourc- 
es, was observed by Goldringer et al. (2001). An increase in plant height 
was also observed by Knapp et al. (2020), associated with late flowering 
time. The increase in plant height during the evolution of cereal mixtures 
and populations has been often considered one of the major drawbacks 
(Denison et al. 2003) of evolutionary breeding since cereal breeding has 
sought short plants which do not lodge even when heavy doses of nitro- 
gen fertilizers are applied. This may be true if one keeps in mind industri- 
al agriculture. However, in organic agriculture, plant height is beneficial 
as one of the plant traits that contribute to weed control. 

In evolutionary theory, “being tall” in the case of cereals is a “selfish 
behaviour” due to one or more “selfish genes”, which favour the indi- 
vidual that possesses them at the expense of others, as opposed to an 
“altruistic behaviour” due to one or more “altruistic genes”, which are 
beneficial to the population rather than to the individual, like the ability 
to suppress weeds (Weiner et al. 2017). However, for the reasons men- 
tioned above, in organic agriculture considering as selfish those genes 
that determine a trait such as plant height, whose effects are beneficial 
for the entire plant community, is questionable. 


7.2.2. Evolutionary populations and phenology 


The effect of natural selection on heading or flowering and maturity 
time varies according to the environment and the population (Bal et al. 
1959; Allard and Jain 1962; Singh and Johnson 1969). Robert W. Allard 
was a pioneer in experimenting with bulk populations, a different term 
to indicate CCs and EPs. 

One of the crosses he used was between Zuiho and Noren 20, a late 
heading and an early heading rice variety, respectively, which was 
grown as a bulk from F, to F, generations at 20 rice research stations 
scattered all over Japan. Every year a random sample of the populations 
was grown in a common location in Central Japan. 

Figure 7. Histograms showing changes in heading time of a hybrid rice population grown at 
various latitudes in Japan (redrawn from Allard and Hansche 1964). 


Natural selection considerably changed the heading time (figure 7). The
average heading time of the populations grown in northern locations (such as
Sapporo) shifted gradually towards earliness, going from 
F, to F, and becoming similar to Noren 20, while those grown in more 
southern locations (such as Chikugo and Miyazaki) shifted towards 
lateness, becoming similar to Zuiho (Allard and Hansche 1964). 

In Australia, mixtures with contrasting phenology were shown to be 
able to stabilize the combined risks of frost, heat, and drought stress in 
dryland environments (Fletcher et al. 2019). 

Phenological diversity within EPs and mixtures, which is often seen as an
undesirable attribute in relation to harvesting, is actually beneficial, as
mixtures of components differing in phenology yield more than the individual
components (Borg et al. 2018). 


7.2.3. Evolutionary populations and yield 


One of the classical experiments that best illustrates how the effects 
of natural selection on yield can vary with environments and popula- 
tions is that by Patel et al. (1987). In this experiment, from the same 21 
crosses between 7 varieties (4 cultivars and 3 advanced selections) of 
barley (excluding reciprocals and selfing), two different types of popu- 
lations were derived. The first was a composite cross (CC), which was 
derived by bulking an equal number of F, seeds from each cross. The 
second was a mixture of doubled haploids¹ (DH): originally, there were 
supposed to be 20 DH for each of the 21 crosses, but for some crosses 
the number of DH was less than 20, so the total number of DH was 398 
instead of 420. This was called the DH mixture. 

Therefore, the CC was a population made of segregating genotypes
with a large recombination potential. In contrast, the DH mixture
was made of highly homozygous genotypes, so any recombination
depended on chance natural crossing. According
to the terminology used in this manual, the CC used in the experiment is
an evolutionary population, while the DH mixture is a dynamic mixture.

The CC and the DH mixture were grown in two locations (that we 
will indicate as O = Elora Research Station in Ontario, with a humid 
climate, and P = Agriculture Canada Research Station at Charlottetown 
in Prince Edward Island, with a maritime climate). 

After the first year, the seed collected at each location was divided
into two parts: one was always kept at the same location while the other
was grown alternately in the two locations, back and forth. As a result,
there were 4 populations and 4 mixtures, which were designated
as follows:

— CC-OO (grown only and always in Ontario)

— CC-PP (grown only and always in Prince Edward Island)

— CC-OP (grown alternately in the two locations starting with Ontario)

— CC-PO (grown alternately in the two locations starting with Prince
Edward Island)


1. A doubled haploid (DH) is a genotype formed when haploid cells undergo chromosome
doubling. Artificial production of doubled haploids is important in plant breeding.

DH-OO, DH-PP, DH-OP, DH-PO were the DH mixtures handled as
described for the CC populations. The populations were grown for 5
years without any artificial selection. For the comparison trial, 50 plants
were randomly selected from each of the 8 populations, multiplied for
one year, and finally tested in two locations in Ontario together with
the seven parents.

The most interesting results (figure 8) were: 

1. The CC populations were higher yielding than the DH mixtures
(figure 8A);

2. Non-alternated populations and mixtures were higher yielding than
alternated populations and mixtures (figure 8B);

3. The population that evolved in Ontario (OO) (figure 8C) was
considerably higher yielding than the population that evolved in
Prince Edward Island (PP);

4. There was no difference between the alternated (OP and PO)
populations and mixtures (figure 8D);

5. The proportion of extracted lines that significantly out-yielded the
mean of the seven parents was higher in the CC populations (53%) than
in the DH mixtures (35%).

Among the most important conclusions of this paper is that natu- 
ral selection reduced the frequency of low yielding genotypes and in- 
creased mean yield. This effect was higher in the CC which, because 
of its genetic structure, had more chances of generating new genotypes. 
Furthermore, the experiment showed that natural selection improves 
yield when evolutionary breeding is used within the intended region of 
adaptation. This fits with the original description of the core features 
of evolutionary breeding by Suneson (1956) as “a broadly diversified 
germplasm and a prolonged subjection of the mass of the progeny to 
competitive natural selection in the area of contemplated use”. One of 
the characters of major interest is of course yield, and the evolution of 
yield of several populations without conscious selection is shown in 
figure 9 (Allard and Hansche 1964). In early generations, yields were 
conspicuously inferior to those of standard locally adapted varieties. 
This is not surprising because each of the populations was based on 
adapted and unadapted parents. The fact that yields improved rapidly in 
all cases provides evidence that natural selection is a powerful force in
eliminating unadapted genotypes.

Figure 8. (A) Frequency distribution of grain yields of the original 398 DH lines and of the lines
within the DH mixtures and CC populations; (B) the alternated and non-alternated populations
and DH mixture; (C) the OO and PP populations and DH mixture; and (D) the OP and PO
populations and DH mixture. The arrow indicates the mean grain yield of the seven parents
(redrawn from Patel et al. 1987). Information about the original DH lines was obtained from an
experiment that was conducted in 1978.

Figure 9. Yield in various generations of four populations (two of barley and two of beans)
expressed as percentage of the yield of a standard commercial variety made equal to 100 (redrawn
from Allard and Hansche 1964).


Rasmusson et al. (1967) evaluated a barley composite population ob- 
tained by mixing seed of 6,000 entries from the barley world collection — 
based on what we said earlier, in this case the term population was used 
improperly. The population was grown under severe stress, imposed by 
late planting, in Minnesota (USA) for 6 years. During this time the com- 
posite was advanced using a random lot of seed from the previous year’s 
harvest, hence without any conscious selection. Remnant seed from each 
year was ultimately planted in yield trials to ascertain the change, if any, 
in the yield potential of the composite. The population’s yield increased 
significantly, indicating a substantial response to natural selection. The 
improvement amounted to 57% during the six years, or an average of 
9.5% for each year of natural selection (figure 10). 
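
As a quick check of the arithmetic, the sketch below reproduces the simple average gain quoted above (57% divided by six years) and, for comparison only, the constant yearly rate that would compound to the same total. The compound figure is not reported in the source, and the function names are our own illustrative choices.

```python
def simple_annual_gain(total_gain_pct: float, years: int) -> float:
    """Average gain per year when the total gain is split evenly over the years."""
    return total_gain_pct / years


def compound_annual_gain(total_gain_pct: float, years: int) -> float:
    """Constant yearly growth rate (in %) that compounds to the same total gain."""
    growth_factor = 1.0 + total_gain_pct / 100.0
    return (growth_factor ** (1.0 / years) - 1.0) * 100.0


if __name__ == "__main__":
    total, years = 57.0, 6  # total improvement and number of years from the text
    print(f"simple average:  {simple_annual_gain(total, years):.1f}% per year")
    print(f"compound rate:   {compound_annual_gain(total, years):.1f}% per year")
```

The simple average is the figure used in the text; the compound rate is lower because each year's gain builds on the previous year's yield.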

Figure 10. Relation between yields of the composite and years of natural selection (redrawn
from Rasmusson et al. 1967).


Suneson (1956) compared four different barley CCs grown in California
with the widely grown variety Atlas 46 (a parent in all the CCs)
and showed that all four populations evolved to produce higher yields
(figure 11). Furthermore, while after 12 generations not a single line
yielded more than Atlas 46, after 20 generations a line out-yielding
Atlas 46 by 37% was identified. In a later generation, three top
selections showed a 56% greater yield than Atlas 46.

Similar results were obtained by Soliman and Allard (1991) with
three composite crosses, namely CCII (the one obtained by intercrossing
28 barley cultivars), CCV (similar to the CCII except that the
intercrossed parents were 30) and CCXXI (synthesized from 6,200 barley
accessions from the USDA collection, hence a mixture and not a
population). For each of the three CCs they grew different generations:
generations 13, 23 and 45 for CCII, generations 5, 10, 21 and 30 for
CCV, and generations 5, 9, 14 and 16 for CCXXI. The material was
grown in Davis from 1977 to 1982 and a steady yield increase was
observed in all the CCs. For example, the CCV gained 21% in yield in
25 generations, while the rate of increase was even higher for CCXXI
(16% in 11 generations).
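
The comparison of rates can be made explicit by dividing each gain by the number of generations it spans. The sketch below assumes, as the figures in the text suggest, that the span is the difference between the last and the first generation evaluated; the dictionary layout and function name are illustrative.

```python
# Yield gains quoted in the text, with the first and last generation evaluated
# for each composite cross (the span is last - first: 25 and 11 generations).
reported = {
    "CCV": {"gain_pct": 21.0, "first_gen": 5, "last_gen": 30},
    "CCXXI": {"gain_pct": 16.0, "first_gen": 5, "last_gen": 16},
}


def gain_per_generation(gain_pct: float, first_gen: int, last_gen: int) -> float:
    """Simple average yield gain (in %) per generation over the evaluated span."""
    return gain_pct / (last_gen - first_gen)


rates = {name: gain_per_generation(**values) for name, values in reported.items()}

if __name__ == "__main__":
    for name, rate in rates.items():
        print(f"{name}: {rate:.2f}% per generation")
```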


Figure 11. The grain yield of four Composite Crosses in percentage of Atlas 46 (yield made
equal to 100) over successive generations (redrawn from Suneson 1956).


7.2.4. Evolutionary populations and disease resistance 


Several scientists investigated the evolution of resistance to fungal 
pathogens in a number of composite crosses and mixtures, indicating 
that a higher level of disease resistance is one of the major advantages 
of populations and mixtures. 

Simmonds (1962) reported several cases of reduced severity and incidence
of diseases in mixtures of varieties. In a review of mixture cultivation
in both developing and developed countries, Smithson and Lenné (1996) 
suggested more durable resistance to insects and diseases as one of the 
perceived advantages of mixtures over their components and possibly one 
of the reasons for larger and more stable yields. They reported 29 examples
of wheat and barley mixtures with decreases in disease severity of more
than 30%, but also cases of increased disease severity. In a more recent
review of the use of mixtures for disease management, Mundt (2002)
suggests that the most important mechanism to explain the reduction in 
severity of diseases in mixtures is the dilution of inoculum that occurs due 
to the distance between plants of the same genotype. However, there is also 
a large variation in the efficacy of mixtures in reducing disease incidence. 

Jackson et al. (1982) analysed the resistance to scald (Rhynchosporium
secalis [Oud.] Davis) of four generations of the CCV
and found a larger than expected number of families resistant to
more than one race and a high proportion of segregating families even
after several generations of selfing, suggesting a higher-than-expected
outcrossing rate or a larger advantage of the heterozygotes. They
concluded that, as the CCV also evolved towards an increased yield and
maintained a high level of disease resistance, it may be considered a 
good source of material for conventional breeding programs. 

Allard (1990) used the CCII and the CCV, described earlier, and 
found that resistance alleles that protected against the most damaging 
pathotypes of scald increased sharply in frequency in the host popula- 
tions, and concluded that the evolutionary processes that take place in 
genetically variable populations propagated under conditions of culti- 
vation can be highly effective in increasing the frequency of desirable 
alleles and useful multilocus genotypes. 

Ibrahim and Barret (1991) studied the evolution of resistance to mildew
in the CCV, grown in Davis, California since 1937. In 1974, samples
of three generations were brought to Cambridge, UK,
where they have been grown as parallel populations (populations 1, 2 and 3)
ever since. The study showed (figure 12) that 1) there have been large
directional shifts towards increased resistance; 2) there are differences
between the three populations in the rate of increase of the frequency of
resistant plants; and 3) there was a strong increase in the frequency of
resistant plants at almost the same time in the three populations.


Figure 12. Percentage frequency distribution of plants resistant to natural mildew infection in
field experiments (redrawn from Ibrahim and Barret 1991). (Panels: Population 1, Population 2,
and Population 3; generations F12 to F41.)


More recently, crop diversity, measured as varietal richness, has been
shown to be associated with a decrease in the
average damage level in banana, plantain, and bean in Uganda (Mulumba
et al. 2012).

In Uganda, the problem of bean resistance to the bean stem maggot
or bean fly was first addressed by analysing 48 varieties, both
traditional and modern. This revealed a range of levels of
resistance (figure 13). From this study, one resistant variety (Kasirira,
a traditional variety) and one susceptible variety (Nabe 4, a modern
variety) were identified and used to constitute two-component mixtures.

Figure 13. Infestation by bean stem maggot in 48 traditional and modern bean varieties in
Uganda. The two arrows show the resistant traditional variety Kasirira and the modern susceptible
Nabe 4, used in the mixture experiment (redrawn from Ssekandi et al. 2016).


The mixtures were made with different proportions of the two vari- 
eties (25% resistant:75% susceptible; 50% resistant:50% susceptible; 
75% resistant:25% susceptible) and were arranged in the field either 
as a systematic random arrangement (figure 14, left) or as alternate 
rows (figure 14, right). Here we report only the results on root dam- 
age where the highest reduction was obtained with the combination 
of 50% of the resistant variety and a systematic random arrangement 
(Ssekandi et al. 2016). 

A meta-analysis of 11 studies conducted on a total of 161 mixtures 
to study the effect of mixtures compared with pure stand in controlling 
stripe rust in wheat (Huang et al. 2012) showed that about 83% of the 
mixtures (133 out of 161) had a positive effect of reducing disease in- 
tensity (figure 15). 

Figure 14. The two different spatial arrangements used in the bean stem maggot experiment in
Uganda (redrawn from Ssekandi et al. 2016).

Figure 15. Frequency distribution of the difference in effect size between mixtures and pure
stands in papers from 1950 to present (B) and the values of effect size (A) where a value <0
indicates a positive effect of cultivar mixtures on disease reduction, while values >0 indicate a
negative effect (redrawn from Huang et al. 2012).


In another study, mixtures of two, three, four, or five winter wheat
(Triticum aestivum) cultivars and their components in pure stands were
either exposed to or protected from two races of stripe rust (Puccinia 
striiformis). Disease severity in the mixtures compared to the mean of 
the components was reduced between 13 and 97%. Changes in disease 
severity could be separated into two effects. First, selection changed 
the frequencies of the cultivars in the mixtures by up to 35% at harvest 
compared to the planted frequencies. Reductions in overall disease se- 
verity in mixtures due to selection for the more resistant cultivar were 
as high as 42% and increases in overall disease severity due to selection 
for the more susceptible cultivar were as high as 11% over the mean 
disease severity in the pure stands. Second, disease severity on individ- 
ual cultivars was reduced below that observed in pure stands because of 
the epidemiological effect of host diversity. Mixtures yielded between 0 
and 5% more than the mean of the pure stands in the absence of disease. 
In the presence of disease, mixing increased yield between 8 and 13% 
(Finckh and Mundt 1992). The advantage of mixtures in reducing the
incidence and severity of fungal diseases has been demonstrated in several
studies (McDonald et al. 1988; Finckh et al. 2000; Finckh and Wolfe
2006). In a study by McDonald et al. (1988), the components of the
material used for simple mixtures (with two, three, or four components)
were extracted either from the parents of the CCII described earlier, or
from one generation of the same CCII. Interestingly,
the maximum reduction in the disease was observed in the mixtures
made of components that were susceptible when grown in pure stand.
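
The two effects described above for the stripe rust mixtures (a shift in cultivar frequencies and a lower severity on each cultivar) can be illustrated with a simple additive accounting. The sketch below is our own illustration under stated assumptions, not necessarily the calculation used by Finckh and Mundt (1992): severity in the mixture is taken as the frequency-weighted mean of the severities on each cultivar, and all numbers in the example are hypothetical.

```python
def partition_severity_change(pure_severity, planted_freq, harvest_freq, mixture_severity):
    """Split the change in disease severity (mixture vs. pure-stand mean) in two parts.

    pure_severity    -- severity of each cultivar grown in pure stand
    planted_freq     -- frequency of each cultivar at planting (sums to 1)
    harvest_freq     -- frequency of each cultivar at harvest (sums to 1)
    mixture_severity -- severity observed on each cultivar within the mixture
    """
    expected = sum(p * s for p, s in zip(planted_freq, pure_severity))
    observed = sum(h * m for h, m in zip(harvest_freq, mixture_severity))
    # Effect of the change in cultivar frequencies, at pure-stand severities.
    selection = sum((h - p) * s
                    for h, p, s in zip(harvest_freq, planted_freq, pure_severity))
    # Effect of reduced (or increased) severity on each cultivar in the mixture.
    epidemiological = sum(h * (m - s)
                          for h, m, s in zip(harvest_freq, mixture_severity, pure_severity))
    return {
        "expected": expected,
        "observed": observed,
        "selection": selection,
        "epidemiological": epidemiological,
    }


if __name__ == "__main__":
    # Hypothetical two-cultivar mixture: a susceptible and a more resistant cultivar.
    result = partition_severity_change(
        pure_severity=[60.0, 20.0],
        planted_freq=[0.5, 0.5],
        harvest_freq=[0.4, 0.6],
        mixture_severity=[40.0, 15.0],
    )
    for name, value in result.items():
        print(f"{name:16s} {value:6.1f}")
```

By construction, the selection and epidemiological terms add up to the difference between the observed and the expected severity.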

The components of a mixture are often chosen for their contrasting
reaction to diseases. Vidal et al. (2017) showed that the effect of a
mixture in controlling disease development can be enhanced by constructing
mixtures whose components differ in canopy architecture, which has an
impact on spore dispersal and microclimate, both of which contribute to
disease development. They used a short susceptible (S) cultivar and mixed it
with either a short or tall resistant (R) cultivar, either in a 1S:1R ratio
or in a 1S:3R ratio (figure 16).

Figure 16. Effect of canopy structure on disease development (PUR = pure stand; HET = heterogeneous
material; HOM = homogeneous material) (redrawn from Vidal et al. 2017).

The largest reduction in disease development (39%) was obtained
when the short susceptible variety was mixed with the tall resistant or
when it was mixed with the short resistant but in a 1S:3R ratio (30%).
Mixing the short resistant with the short susceptible in equal proportion
reduced the disease development by 50%.

More recently, significantly lower net blotch severity under both
conventional and organic conditions, and significantly lower powdery
mildew infection, were observed in spring barley populations compared
with homogeneous varieties in Latvia (Ločmele et al. 2019).

The ability of mixtures to control diseases has been found to depend
on plot size, with mixtures planted on a farm scale giving better control
than small experimental plots (Gieffers and Hesselbach 1988). However,
Newton et al. (2002) found that the percentage of powdery mildew
reduction in small (6.75 m²) plots was greater than in larger (20 m²)
plots. A subsequent study (Newton and Guy 2011) confirmed that mixtures
showed either an effect in reducing powdery mildew infection
compared with the component monoculture mean, or no effect, and that
there was a trend towards a greater reduction at low fertilizer levels and
smaller plot sizes.

Evolution of resistance to powdery mildew was found in populations 
of bread wheat with the highest level of adult resistance developed 
when the populations evolved in sites where powdery mildew pressure 
is known to be high (Paillard et al. 2000). 

All these studies show that CCs and mixtures are able to evolve to- 
wards a higher yield and a higher level of disease resistance during 
subsequent generations. 


7.2.5. Evolutionary populations and weed control 


The effect of evolutionary populations and mixtures on weed control,
a problem particularly important in organic agriculture where the options
to control weeds are limited, has not been widely investigated. Lazzaro et
al. (2017) investigated mixtures of various diversity levels, both in terms
of genotypic (i.e., number of cultivars in the mixture) and functional
diversity and found that weed biomass was 65% lower for the most diverse
mixture (both genotypically and functionally), as compared to the average
of the other entries (i.e., less diverse mixtures and mixture components
grown as pure stands) in a year of higher weed infestation.

In a comparison along a genotypic diversity gradient made of mixtures,
normal CCs and male-sterile CCs derived from the same parental material,
no difference was found in terms of yield or grain quality, but CCs
displayed higher plant height and earlier ground cover than the mixtures,
suggesting a weed competitiveness advantage of CCs (Döring et al. 2015).

Early vigour and early plant height, likely to be components of early 
ground cover, have been found to be associated with weed competitive 
ability in wheat (Kissing Kucek et al. 2012a). 

Plant height diversity also appears to be associated with overyielding
(Borg et al. 2018), namely EPs and mixtures being more productive
than the average of their components grown as monocultures, presumably
in relation to its impact on weed suppression (Kier et al. 2012).

The practice of using high seed rates in cereal cropping was already 
used before the Green Revolution, but it was reaffirmed with the advent 
of the Green Revolution and industrial agriculture. This was because of 
the new model of short-strawed, high harvest index cultivars, which were 
also less competitive (Lazzaro et al. 2019). Besides, cultivar competitive 
traits were not a breeding priority because herbicides and mechanization 
offered better weed-control (Gallandt and Weiner 2015). 

We will return to the issue of weed control when discussing the ag- 
ronomic management of EPs and mixtures, particularly in relation to 
plant density. 


7.2.6. Evolutionary populations and yield stability 


In discussing GEI (page 33), we mentioned that heterogeneous
populations can exploit both individual and population buffering and
are therefore expected to have a higher stability, defined, as done earlier,
as consistency of yield across seasons.

When discussing yield stability, it is useful to distinguish between 
static stability (also defined as Type I, or homoeostasis or biological 
stability) and dynamic stability (also defined as Type II, or agronomic 
stability [Lin et al. 1986]). According to static stability, a genotype is
stable when its yield tends to remain the same across years and locations
(environments); these are usually the genotypes which perform relatively
better in unfavourable than in favourable environments. According to
dynamic stability, a genotype is stable when its yield response across
environments is parallel to the mean response of all genotypes in the
experiment.
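
The two concepts can be made concrete with two common statistics, shown here as a minimal sketch under our own assumptions: the variance of a genotype's yield across environments for static stability (a smaller variance means a more stable genotype), and the slope of its yield regressed on an environmental index for dynamic stability (a slope close to 1 means a response parallel to the mean of all genotypes). The yield matrix is hypothetical and the function names are illustrative.

```python
import numpy as np


def static_stability(yields: np.ndarray) -> np.ndarray:
    """Variance of each genotype across environments (rows = genotypes)."""
    return np.var(yields, axis=1, ddof=1)


def dynamic_stability_slope(yields: np.ndarray) -> np.ndarray:
    """Regression slope of each genotype on the environmental index.

    The environmental index is the mean of all genotypes in an environment
    minus the grand mean, so it sums to zero across environments.
    """
    env_index = yields.mean(axis=0) - yields.mean()
    return yields @ env_index / (env_index @ env_index)


if __name__ == "__main__":
    # Hypothetical yields (t/ha) of three genotypes in three environments,
    # ordered from the least to the most favourable environment.
    yields = np.array([
        [3.0, 3.0, 3.0],  # constant yield: statically stable
        [1.0, 3.0, 5.0],  # very responsive to the environment
        [2.0, 3.0, 4.0],  # follows the mean response: dynamically stable
    ])
    print("environmental variance:", static_stability(yields))
    print("regression slope:      ", dynamic_stability_slope(yields))
```

In this example the first genotype has zero variance and a slope of 0, while the third has a slope of exactly 1, which illustrates why a genotype can be stable in one sense and not in the other.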

In 1961, Allard published a paper (Allard 1961) describing an experiment
designed to analyse the relationship between genetic diversity and
stability defined as consistency of performance (namely consistency of
yield) across environments. He compared three lima bean pure line
varieties, four mixtures (three made of two varieties each and one made
of all three), and three bulk populations obtained by crossing the pure
lines in pairs and advancing the progeny by a few generations
without conscious selection. Therefore, the three groups represent
three different levels of diversity: higher in the bulks, intermediate in
the mixtures, and lower in the pure lines.

The ten entries (3 pure lines, 4 mixtures, and 3 bulks) were grown 
in replicated trials (four replications) for four years in four locations in 
Central California. The stability was measured as consistency of ranks 
and as magnitude of variances. 

Figure 17. Frequency with which the three pure lines ranked first to tenth in yield in 64 replicates
occurring in 4 locations over 4 years (redrawn from Allard 1961).

The distribution curves of the ranks in yield of the three pure lines
(figure 17) in each of 64 environments (4 replications x 4 locations x
4 years, considering every single replication as a different environment)
are characteristically U-shaped, indicating that they are successful in many
environments (high frequency of low ranks, that is, close to first) and
unsuccessful in many others (high frequency of high ranks).

The curves for mixtures and bulks (figure 18), in contrast, tended 
towards normality, indicating that the genetically diverse populations 
were intermediate in any one environment. 
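
The rank distributions described above can be tabulated as follows. This is a minimal sketch with hypothetical yields, in which rank 1 is the highest-yielding entry in an environment and ties are broken arbitrarily; the function name and data layout are our own assumptions.

```python
import numpy as np


def rank_frequencies(yields: np.ndarray) -> np.ndarray:
    """Count how often each entry obtains each rank.

    yields -- array of shape (entries, environments)
    Returns an array counts[entry, rank - 1], with rank 1 = highest yield.
    """
    n_entries, n_envs = yields.shape
    # order[r, e] is the entry that obtained rank r + 1 in environment e.
    order = np.argsort(-yields, axis=0)
    counts = np.zeros((n_entries, n_entries), dtype=int)
    for rank in range(n_entries):
        for env in range(n_envs):
            counts[order[rank, env], rank] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical yields of three entries in four environments.
    yields = np.array([
        [5.0, 1.0, 5.0, 1.0],  # either first or last: U-shaped distribution
        [3.0, 3.0, 3.0, 3.0],  # always intermediate
        [1.0, 5.0, 1.0, 5.0],  # either last or first: U-shaped distribution
    ])
    print(rank_frequencies(yields))
```

Each row of the result is the rank distribution of one entry; a row with most counts at both ends corresponds to the U-shaped curves of the pure lines, while a row concentrated in the middle corresponds to the curves of the mixtures and bulks.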


Figure 18. Frequency with which the four mixtures (left) and the three bulks (right) ranked
first to tenth in yield in 64 replicates occurring in 4 locations over 4 years (redrawn from
Allard 1961).


The two highest yielding entries were one of the pure lines (figure 17)
and one of the bulks (figure 18, right), but their high mean yield was
achieved in a very different way: the bulk ranked first less frequently
than the pure line, but it ranked second, third, fourth, and fifth more
often and, more importantly, was never one of the three lowest.

This is even clearer in figure 19, where the means of the 4 replicates 
and diversity groups were plotted confirming the U-shaped curve of the 
pure lines and the more normal curves of the mixtures and the bulks. 
Therefore, the genetically diverse populations were more stable and 
consistent than the genetically uniform populations. 

Figure 19. Frequency with which the pure lines, the mixtures and the bulks ranked first to tenth
in yield in the 16 combinations of 4 locations and 4 years (redrawn from Allard 1961).


Information on population buffering in heterogeneous material comes 
from both cross-pollinated and self-pollinated crops. 

Several studies reported a yield stabilizing effect of mixtures com- 
pared to monocultures (Frey and Maldonado 1967; Wolfe 1985; Dubin 
and Wolfe 1994). 

Sprague and Federer (1951) compared double-cross maize hybrids
with single-cross maize hybrids and found that both the genotype x
location (GL) and the genotype x year (GY) interactions were
smaller for double crosses, indicating that these were more stable than
single crosses. Jones (1958) confirmed these results in yield trials and
found that the coefficient of variation was smaller for double crosses
(12.3%) than for single crosses (21.4%).
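
The coefficient of variation expresses the standard deviation as a percentage of the mean, which allows the variability of entries with different mean yields to be compared. The sketch below shows the calculation on hypothetical yields; it is not a reconstruction of the data or of the exact procedure used by Jones (1958).

```python
import statistics


def coefficient_of_variation(yields):
    """Sample standard deviation expressed as a percentage of the mean."""
    return statistics.stdev(yields) / statistics.mean(yields) * 100.0


if __name__ == "__main__":
    # Hypothetical yields (t/ha) of two hybrids in four environments.
    double_cross = [5.0, 5.5, 6.0, 5.5]
    single_cross = [4.0, 7.0, 5.0, 6.0]
    print(f"double cross CV: {coefficient_of_variation(double_cross):.1f}%")
    print(f"single cross CV: {coefficient_of_variation(single_cross):.1f}%")
```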

In winter wheat, yield stability measured by environmental variance
was found to be higher in EPs than the mean of the parents (Döring
et al. 2015). In the same study, an index of yield reliability was used,
which estimates the lowest yield expected with a given probability chosen
according to the level of risk aversion by farmers: EPs and mixtures
tended to have a higher reliability index. In other words, they tend to
have a more reliable yield than the mean of the parents.
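
One simple way to express such an index, shown here only as a sketch under the assumption that yields across environments are approximately normally distributed, is the mean yield minus a multiple of the standard deviation that depends on the accepted risk. The exact index used in the study cited above may differ; the function name, the default risk level, and the yields are hypothetical.

```python
from statistics import NormalDist, mean, stdev


def yield_reliability(yields, risk=0.25):
    """Yield expected to be reached or exceeded with probability 1 - risk.

    Assumes normally distributed yields; a smaller risk (a more risk-averse
    farmer) gives a lower, more conservative value.
    """
    z = NormalDist().inv_cdf(1.0 - risk)
    return mean(yields) - z * stdev(yields)


if __name__ == "__main__":
    # Hypothetical yields (t/ha) of two entries with the same mean.
    stable_entry = [4.5, 5.0, 5.5]
    variable_entry = [3.0, 5.0, 7.0]
    for name, data in [("stable", stable_entry), ("variable", variable_entry)]:
        print(f"{name:9s} mean = {mean(data):.2f}  reliability = {yield_reliability(data):.2f}")
```

With equal mean yields, the entry with the smaller variation across environments has the higher reliability value.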

Weedon and Finckh (2019) found in winter wheat that the genetic
background affects yield stability of EPs: those with a wide genetic
basis have a better dynamic stability, while those with a narrow
genetic basis tend to have a better static stability. Furthermore, EPs tend
to have better stability than uniform varieties under industrial
agricultural systems.


7.2.7. Recent developments in evolutionary plant breeding 


More recently, several papers have confirmed the early
findings and shown that both natural populations and experimental
EPs do evolve.

Examples similar to those already presented come from wild relatives.

In Israel, Nevo et al. (2012) examined 10 wild emmer wheat (Trit- 
icum dicoccoides Koern.) populations and 10 wild barley (Hordeum 
spontaneum K. Koch) populations, sampling them in 1980 and again 
in 2008, and performed phenotypic and genotypic analyses on the col- 
lected samples. 

Figure 20. Differences in flowering time of populations of wild emmer and wild barley collected
in 1980 and in 2008 from the same locations (redrawn from Nevo et al. 2012).


They witnessed profound adaptive changes in flowering time of these 
wild cereals in Israel over the 28 years between the two collecting times. 
As shown in figure 20, in all 10 collection sites for both wild species,
the populations sampled in 2008 flowered significantly earlier than the
populations sampled at the same locations 28 years earlier. Most likely,
a progressive adaptation to the changes in climate was at the basis of
the adaptive change.

Vigouroux et al. (2011) performed a detailed study of pearl millet 
in Niger, where the crop is cultivated on 65% of the area. Similarly to
the work cited above, they examined samples of pearl millet landraces
collected from the same villages in 1976 and again in 2003. 

As shown in figure 21, there was a progressive decrease in rainfall,
indicating a significant climate shift to drier conditions over the
50-year period.

Figure 21. Rainfall isohyets for the 1950-1976 and for the 1977-2003 period. The shift in isohyets
100 to 150 km further south from about 17° latitude to just above 16° illustrates the average
decrease in rainfall (redrawn from Vigouroux et al. 2011).

Figure 22. Flowering time of pearl millet in 79 villages throughout Niger. The varieties collected
in 2003 were earlier than those collected in 1976 (redrawn from Vigouroux et al. 2011).


Among other traits, the landraces were examined for flowering time, rep- 
resented in figure 22 with circles: the smaller the circle, the earlier the flow- 
ering time. After making sure that there was not a shift in landraces, the data 
showed that the earliness increased significantly from the samples collected 
in 1976 to those collected in 2003. After excluding the effect of genetic drift 
and of sampling, the authors concluded that recurrent drought led to selec- 
tion for earlier flowering in one of the major crops in the Sahelian region. 

Goldringer et al. (2006) showed that EPs can rapidly adapt to differ- 
ent geographical areas by changing their phenology. 

An EP of bread wheat was developed by crossing 16 parents in pairs
to obtain 8 F1 hybrids, which in turn were crossed in what is
known as a pyramidal design (Thomas et al. 1991), as shown in figure
23, to obtain the evolutionary population called PAO.

Figure 23. Development of an evolutionary population by crossing 16 bread wheat varieties
using a pyramidal design. The resulting population (PAO) evolved for 10 generations in seven
locations in France under intensive and extensive management (redrawn from Goldringer et
al. 2006).

Figure 24. The locations in France where the population evolved (to the left) and the heading
time of the populations which evolved in three contrasting locations (redrawn from Goldringer
et al. 2006).

The PAO was grown in seven locations, and in some locations both
under intensive and extensive management.

Figure 24 shows that, after 10 generations, the population that
evolved under extensive management in Le Moulon, Northern France,
was significantly later heading than the population that evolved,
also under extensive management, in Toulouse, Southern France,
with a much warmer climate.

Danquah and Barrett (2002) evaluated three generations from each of
three populations of the barley CCV maintained in Cambridge since 1977
and used by Ibrahim and Barret (1991) (see page 50) for yield in two
years, 1991 and 1992, using the variety Atem as a control. The most
remarkable result was that Atem was the highest yielding in 1991 (table 3),
while in 1992 three composites (one from population I and two from
population III) out-yielded Atem.

The authors attributed this change in ranking to a long period of drought 
which in 1992 may have affected Atem more than the CCV derived popu- 
lations. Atem is a cultivar selected for high input conditions and therefore 
the results were not surprising. This experiment is additional evidence 
that EPs are generally more stable than commercial cultivars. 


Table 3. Mean grain yield (g/plot) of Atem and a range of generations
from three CCV populations* grown in the field over two years
(redrawn from Danquah and Barrett 2002).


Cultivar/CCV      1991      1992    2-year mean    % of Atem
Atem            2466.1     716.8       1591.4        100.0
IF 5             945.2     722.9        834.1         52.4
IF,              928.9     102.6        515.8         32.4
IF,             1184.5     451.5        817.9         51.4
IIF,,           1247.7     393.9        820.8         51.6
IIF,.           1240.4     301.3        770.9         48.4
IIF,,           1463.8     426.9        945.4         59.4
IIIF,,          1678.8     785.5       1232.2         77.4
IIIF,,          1712.8     483.9       1098.3         69.1
IIIF,,          1479.2    1012.7       1245.9         78.3
Mean of CCV     1320.1     520.1        920.1         57.8


* I, II and III are three different populations of the CCV
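
The last two columns of table 3 are derived from the yearly yields, so they can be recomputed as a consistency check. The sketch below does this and also lists the composites that out-yielded Atem in a given year. The entry labels are placeholders of our own: only the population (I, II, III) is used, and the suffixes a, b, c simply follow the row order of the table.

```python
# Yearly plot yields (g/plot) from table 3: (label, yield 1991, yield 1992).
# Labels are placeholders; suffixes a, b, c follow the row order of the table.
ATEM = ("Atem", 2466.1, 716.8)
COMPOSITES = [
    ("I-a", 945.2, 722.9),
    ("I-b", 928.9, 102.6),
    ("I-c", 1184.5, 451.5),
    ("II-a", 1247.7, 393.9),
    ("II-b", 1240.4, 301.3),
    ("II-c", 1463.8, 426.9),
    ("III-a", 1678.8, 785.5),
    ("III-b", 1712.8, 483.9),
    ("III-c", 1479.2, 1012.7),
]


def two_year_mean(y1991, y1992):
    """Mean of the two yearly yields."""
    return (y1991 + y1992) / 2.0


def summarize(entries, control):
    """Return (label, two-year mean, % of control) for every entry."""
    control_mean = two_year_mean(control[1], control[2])
    rows = []
    for label, y1991, y1992 in entries:
        entry_mean = two_year_mean(y1991, y1992)
        rows.append((label, entry_mean, 100.0 * entry_mean / control_mean))
    return rows


def outyield_control(entries, control, year_index):
    """Labels of entries yielding more than the control (1 = 1991, 2 = 1992)."""
    return [e[0] for e in entries if e[year_index] > control[year_index]]


if __name__ == "__main__":
    for label, entry_mean, pct in summarize(COMPOSITES, ATEM):
        print(f"{label:6s} {entry_mean:8.1f} {pct:6.1f}")
    print("Out-yielding Atem in 1991:", outyield_control(COMPOSITES, ATEM, 1))
    print("Out-yielding Atem in 1992:", outyield_control(COMPOSITES, ATEM, 2))
```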

More recently, Raggi et al. (2016a, 2016b, 2017) developed a barley
composite cross (named AUTDBA) by crossing in all possible com- 
binations seven F, derived from seven crosses between cultivars, lan- 
draces, and promising lines. The population was grown for 9 years at 
one location under low input conditions. The population was then sub- 
mitted to artificial selection and a new population (named MIX48) was 
developed by mixing the highest yielding and most diverse lines. The 
selection also generated 13 pure lines. AUTDBA, MIX48 and the 13 
lines were evaluated for four successive years in MET carried out under 
different pedo-climatic conditions and management systems (organic 
and low-input) using five commercial varieties and five breeding lines 
as controls. 

Location mean yields ranged from 2.5 to 5.5 t ha⁻¹. The interactions
between entries (genotypes and populations) and environments were
analysed with a GGE biplot (R Core Team 2015). In the biplot (figure 25),
the green line, called the mean environment axis, passes through the
biplot origin and a second line (in red in figure 25) is perpendicular to
the mean environment axis (see also the Experimental designs and
Statistical analysis section). The projections of the genotypes tested in
the experiment on the mean environment axis approximate their mean
yields, while the projections on the perpendicular axis approximate the
GEI associated with each entry. The genotypes or environments located
near the centre of the axes are characterized by high stability; genotypes
at the same distance from the centre of the axes have similar stability;
closeness between genotypes and environments is directly related to
their degree of interaction. The circle indicates the ideal genotype while
the arrow points in the direction of higher yields.
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Figure 25. Biplot of grain yield of 24 barley entries in eight environments in Central Italy (A breeding 
lines; Ii commercial varieties; Ml pure lines selected from the evolutionary population) (redrawn from 
Raggi et al. 2017). 


The distance from the green vector, measured along the red vector, is a
measure of stability: the closer the lines are to the green vector, the
more stable they are.
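
The two projections described above can be sketched numerically. The example below is only an illustration under stated assumptions: it is not the analysis of Raggi et al. (2017), the yields are hypothetical, the function name is ours, and we assume environment-centred data with the singular values assigned entirely to the genotype scores.

```python
import numpy as np


def gge_mean_and_stability(yields):
    """Project genotypes onto the mean environment axis of a GGE biplot.

    yields: 2-D array, one row per genotype, one column per environment.
    Returns (mean_score, instability), one value per genotype.
    """
    y = np.asarray(yields, dtype=float)
    # GGE keeps genotype (G) and genotype-by-environment (GE) effects:
    # only the environment means are removed (column centring).
    centred = y - y.mean(axis=0)
    u, s, vt = np.linalg.svd(centred, full_matrices=False)
    # Assumed scaling: singular values assigned entirely to the genotypes.
    genotype_scores = u[:, :2] * s[:2]
    environment_scores = vt[:2].T
    # Mean environment axis: from the origin through the average
    # of the environment markers.
    axis = environment_scores.mean(axis=0)
    axis = axis / np.linalg.norm(axis)
    perpendicular = np.array([-axis[1], axis[0]])
    # Position along the axis approximates mean yield; distance from
    # the axis approximates the interaction with environments.
    mean_score = genotype_scores @ axis
    instability = np.abs(genotype_scores @ perpendicular)
    return mean_score, instability


# Hypothetical yields (t/ha) of four entries in three environments.
example = np.array([
    [5.6, 3.1, 1.9],
    [3.9, 3.6, 3.4],
    [3.4, 3.2, 2.9],
    [3.0, 2.5, 2.3],
])
mean_score, instability = gge_mean_and_stability(example)
for label, m, i in zip("ABCD", mean_score, instability):
    print(f"entry {label}: mean score {m:+.2f}, instability {i:.2f}")
```

An entry with a high mean score and a low instability value lies far along the mean environment axis and close to it, which is the position of the ideal genotype in the biplot.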

The highest yielding line was one of the breeding lines, which was 
however very unstable as indicated by its distance from the green line. 
Three lines selected from the EP (EP lines) as well as the MIX48 were 
not significantly different in yield from the highest yielding breeding 
lines but were much more stable. 

In the high-productive trials (> 3,000 kg ha⁻¹), the mean yields of the
four groups (EPs, EP lines, commercial varieties, and breeding lines)
were not statistically different (figure 26), while in the
low-productive trials the EPs, as well as the EP lines, yielded as well
as the commercial varieties and significantly more than the average of
the breeding lines.

Figure 26. Histograms of the mean grain yield of EPs, lines selected
from EPs (EP lines), commercial varieties (Comm. varieties) and breeding
lines (Breed. lines) in high- and low-productive trials (left and right
part, respectively). Histograms with different lowercase letters are
significantly different according to the LSD test (p < 0.05) (redrawn
from Raggi et al. 2017).


A meta-analysis of 91 studies and more than 3,600 observations con- 
cluded that cultivar mixtures are a viable strategy to increase diversity 
in agro-systems, increasing yield and yield stability as well as disease 
resistance (Reiss and Drinkwater 2018). 

In this study, relative yield (RY) was used to compare varieties in 
monocultures and mixtures with RY>1 indicating a yield benefit from 
mixing, RY<1 indicating a yield penalty from mixing, and RY=1 indi- 
cating no change from mixing compared to monoculture. 
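
As a minimal sketch of how RY can be computed, the example below assumes that the expected yield of a mixture is the average of the monoculture yields of its components, weighted by their proportion in the mixture. The function name and the yields are hypothetical and are not taken from Reiss and Drinkwater (2018).

```python
def relative_yield(mixture_yield, monoculture_yields, proportions=None):
    """RY = observed mixture yield / yield expected from the monocultures."""
    n = len(monoculture_yields)
    if proportions is None:
        # Assumption: components mixed in equal proportions.
        proportions = [1.0 / n] * n
    if abs(sum(proportions) - 1.0) > 1e-9:
        raise ValueError("proportions must sum to 1")
    expected = sum(p * y for p, y in zip(proportions, monoculture_yields))
    return mixture_yield / expected


# Hypothetical example: three varieties yielding 4.0, 3.6 and 4.4 t/ha
# as monocultures, and 4.2 t/ha when grown as an equal-parts mixture.
ry = relative_yield(4.2, [4.0, 3.6, 4.4])
print(f"RY = {ry:.3f}")
```

In this hypothetical case the expected yield is 4.0 t/ha, so the mixture shows a yield benefit from mixing (RY>1).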

Interestingly, 80% of the studies were on small grain (43% on wheat, 
20% on barley, 17% on oat) while soybean and maize accounted for 
10.5% and 3.5% of the observations. Most experiments were conducted 
in North America (80%) with Europe (7.8%) and Asia (6.8%) making 
the next two largest groups. 

Overall, mixtures yielded 2.2% more than expected based on the
monoculture yields of their components (figure 27). The increase was
significant for all the crops except sorghum (figure 28). Fourteen
percent of the mixtures had RY increases greater than 10%.

The fact that corn mixtures have the largest increases of RY is not 
surprising given that corn is a cross-pollinated crop and therefore even 
a mixture after one cropping season becomes an EP (as discussed when 
comparing static with dynamic mixtures).

Figure 27. Distribution of relative yield (RY) in cultivar mixture
experiments. Those to the right of the broken line are higher than 1,
those to the left are lower than 1. The mean is 1.0217 (redrawn from
Reiss and Drinkwater 2018).


[Figure 28: mean RY by crop on a horizontal axis from 0.98 to 1.12.
Crops and number of observations: Corn (125), Legumes (101), Wheat
(1563), Oat (614), Barley (732), Soybean (380), Sorghum (67).]

Figure 28. Relative yield (RY) in cultivar mixture experiments disaggregated by crops (num- 
ber of mixture observations in parenthesis). The legume group includes common bean, com- 
mon vetch, cowpea, field pea, moth bean, all represented by one study each. The broken line 
is RY =1 (redrawn from Reiss and Drinkwater 2018). 


The studies used in the meta-analysis included mixtures of different 
composition. As shown in figure 29a, the simple mixtures (two or three
components) had a mean RY of 1.02 while the more complex mixtures
(four or more components) had a much higher mean RY (RY ≈ 1.05).
Also, the mixtures constructed with a specific basis for selecting the 
components had a higher RY compared with those where the rationale 
was not stated (figure 29b). 

Mixtures planned on a combination of both physical characteristics 
and disease resistance had a significantly higher RY than those mixtures 
based on either a physical or a disease basis alone (figure 29c). 

The RY of mixtures was higher in the presence of abiotic stresses such
as low pH (figure 30a) and the absence of fertilizer (figure 30b), while
no difference associated with water availability was detected
(figure 30c).

However, ecological evidence suggests that species mixtures could
perform better in the presence of stresses such as drought (Holmgren
and Scheffer 2010). This is supported by other studies, which have
shown that the yield advantage of mixtures increases as the environment
becomes more stressed (Frey and Maldonado 1967; Döring et al. 2010), as
we have seen earlier in the experiment of Danquah and Barrett described
in table 3.

Figure 29. Relative yield (RY) in cultivar mixtures as affected by their
composition (number of mixture observations in parentheses). The broken
line is RY = 1 (redrawn from Reiss and Drinkwater 2018).

Figure 30. Effect of environmental stresses on RY of mixture
experiments disaggregated by crops (number of mixture observations in
parentheses). The broken line is RY = 1 (redrawn from Reiss and
Drinkwater 2018).

Figure 31. Yield stability of monocultures and mixtures over time and
over space (redrawn from Reiss and Drinkwater 2018). The CVs of bars
labelled with different letters are significantly different.

As found in many other studies, mixtures grown in environments
with high disease pressure had a higher RY than those grown under
conditions with little or no disease pressure (figure 30d).

Confirming many of the earlier studies on both mixtures and popu- 
lations, this study also shows that compared to mixtures, monocultures 
tended to have lower stability, as measured by the average coefficient of 
variation (CV) of yield (figure 31). However, in these studies the higher 
stability of mixtures was significant over time but not over space, name- 
ly over different locations in the same year. 
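
The coefficient of variation used here as a measure of stability can be sketched as follows. The yields are hypothetical and the function name is ours; we assume the CV is the sample standard deviation expressed as a percentage of the mean.

```python
import statistics


def coefficient_of_variation(yields):
    """CV (%) = sample standard deviation / mean * 100."""
    return statistics.stdev(yields) / statistics.mean(yields) * 100.0


# Hypothetical yields (t/ha) of one plot type over three years.
monoculture = [4.0, 2.0, 3.0]
mixture = [3.5, 2.5, 3.0]

print(f"monoculture CV: {coefficient_of_variation(monoculture):.1f}%")
print(f"mixture CV:     {coefficient_of_variation(mixture):.1f}%")
```

Both series have the same mean yield, but the mixture has the lower CV and would therefore be considered the more stable of the two over time.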


7.3. An ecological view of evolutionary populations 


Ecologists recognize that biodiversity can strongly affect ecosys- 
tem functioning and have shown that primary production, namely the 
production of organic compounds through photosynthesis, and total 
plant biomass increase with species richness. In theoretical ecology, 
diversity has an important role in the functioning and resilience of 
ecosystems (Litrico and Violle 2015). As we have seen at the begin- 
ning of this manual and despite the demonstrated benefits of biodi- 
versity, agriculture has historically and progressively reduced species 
richness and genetic diversity within species while increasing yields 
under high-input conditions (Barot et al. 2017). Twenty major crops 
(from a total of 2,500 domesticated plant species) currently cover 
44% of arable land (Leff 2004). 

The ecological mechanisms that explain the positive effect of
within-species diversity, such as the diversity of mixtures and EPs, are
extrapolated from the effects of “species richness”, as defined by
ecologists, by replacing “species” with “varieties or genotypes” in the
case of mixtures and EPs. This extrapolation is possible if the
components of the mixtures and EPs are functionally diverse: for
example, if they differ in plant height, days to heading, rooting depth,
reaction to diseases, tolerance to abiotic stresses, etc. Therefore, a
mixture of modern varieties will elicit only marginally positive
effects, as its level of diversity will be very modest.

The ecological mechanisms that explain the positive effect of
within-species diversity, and hence the advantages of EPs and mixtures,
are known as sampling effects (Barot et al. 2017), which have also been
defined as compensation by Döring et al. (2011), and complementarity
effects (Barot et al. 2017).

These two effects are related to the two components of fitness, defined 
as the reproductive success of a population or of a specific genotype 
(Watt 2001), namely environmental fitness and selfish fitness. Environ- 
mental fitness refers to an individual well-adapted to its own growing 
environment (soil, climate, presence of pests, drought) while selfish fit- 
ness refers to a plant possessing competitive traits such as early vigour, 
tallness, and tillering capacity (Lazzaro et al. 2019). These traits allow 
a genotype to use resources better and faster than its neighbours. The 
distinction is important because evolution guided by higher selfish fit- 
ness, being blind to overall plant performance, may in some cases be 
antagonistic to overall crop performance and stimulate complementary 
effects in others (Beaugendre et al. 2022), depending on what defines 
the fitness advantage of a competitive interaction (Weiner et al. 2017). 

Sampling effects, also called selection effects, are expected to lead
to stability of production across environments in both time and space
as well as to overyielding (defined on page 57), while complementarity
effects can be expected to lead to overyielding, and even to
transgressive overyielding (i.e., EPs and mixtures being more productive
than their most productive individual component grown as a monoculture)
(Barot et al. 2017), or to enhanced yield-sustaining benefits, such as
the stimulation of ecosystem services.
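
The difference between overyielding and transgressive overyielding can be illustrated with a small sketch. We assume here that overyielding means a mixture yield above the mean of its components grown as monocultures (the definition given earlier in the manual may be worded differently); the function name and yields are hypothetical.

```python
def classify_overyielding(mixture_yield, monoculture_yields):
    """Compare a mixture with the mean and the best of its components."""
    mean_monoculture = sum(monoculture_yields) / len(monoculture_yields)
    best_monoculture = max(monoculture_yields)
    if mixture_yield > best_monoculture:
        # Better than even the most productive single component.
        return "transgressive overyielding"
    if mixture_yield > mean_monoculture:
        # Better than the average component, but not than the best one.
        return "overyielding"
    return "no overyielding"


# Hypothetical monoculture yields (t/ha) of three components.
components = [3.0, 3.5, 4.0]
for mixture_yield in (3.2, 3.8, 4.3):
    print(mixture_yield, classify_overyielding(mixture_yield, components))
```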

Sampling effects, as described by ecologists, are useful in the agricul- 
tural context when a) environmental conditions (climate, availability of 
nutrients) vary in time and space, b) the components of the EPs and the 
mixtures respond differently to these fluctuating environmental condi- 
tions, and c) the best adapted components of the EPs and the mixtures 
become more frequent. 
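
Conditions a) to c) can be illustrated with a very simple selection model. This is a sketch under strong assumptions: no recombination, each component characterized by a single relative fitness value (for example relative seed output) in each type of season, and hypothetical numbers throughout.

```python
def next_frequencies(freqs, fitness):
    """One season of selection: frequencies weighted by relative fitness."""
    weighted = [p * w for p, w in zip(freqs, fitness)]
    total = sum(weighted)
    return [x / total for x in weighted]


def simulate(freqs, fitness_by_season):
    """Return the component frequencies before and after each season."""
    history = [list(freqs)]
    for fitness in fitness_by_season:
        freqs = next_frequencies(freqs, fitness)
        history.append(freqs)
    return history


# Hypothetical relative fitness of three components in two season types.
dry = [1.2, 1.0, 0.7]   # component 1 tolerates drought best
wet = [0.9, 1.0, 1.1]   # component 3 does best with ample water
seasons = [dry, wet, dry, dry, wet, dry]

history = simulate([1 / 3, 1 / 3, 1 / 3], seasons)
for year, freqs in enumerate(history):
    print(year, [round(p, 3) for p in freqs])
```

Because dry seasons are the more frequent in this hypothetical sequence, the drought-tolerant component becomes the most frequent, while the wet seasons slow down the decline of the others.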

Complementarity effects are those deriving from a complementary
use of resources in space and time. This type of effect explains the
reduction of the impact of pests in mixtures and EPs described earlier,
and also allows mixtures and EPs to better exploit resources such as
water, mineral nutrients, or light, which would in turn allow them to
have a higher productivity or resistance to environmental hazards than
fields planted with a single variety (Barot et al. 2017). There is some
evidence of complementarity effects due to different rooting depths that
allow maximum nutrient acquisition (Fargione and Tilman 2005). Li et al.
(2014) found evidence of complementarity effects expressed as the
ability of some crops, such as faba bean, lupine, pigeon pea, chickpea,
and peanut, to mobilize nutrients such as phosphorus (P) and
micronutrients such as iron (Fe), zinc (Zn) and manganese (Mn), thus
improving nutrition for themselves and for neighbours unable to mobilize
these nutrients.

Complementary genotypes would occupy slightly differentiated 
ecological niches, and thereby the whole crop a larger ecological 
niche, which could result in improved resource use, reduced intraspe- 
cific competition and increased competitiveness against weeds (Bar- 
ot et al. 2017). 


7.4. The science of evolutionary plant breeding: Conclusions 


A wealth of genetic, ecological and evolutionary research, spanning 
over several decades, has shown that reincorporating diversity into 
agro-systems to promote ecosystem services is one viable approach for 
reducing environmental impact while maintaining and even increasing 
yields (Kremen and Miles 2012). 

This body of research has shown that:

1. Evolutionary populations and mixtures are able to adapt their
phenology to the location in which they are multiplied;

2. They evolve, becoming more and more productive;

3. Evolutionary populations, and to a lesser extent mixtures, have a 
more stable yield over time than uniform varieties but not over 
space, i.e., they become specifically adapted; 

4. Evolutionary populations and mixtures evolve, becoming more 
and more resistant to diseases; 

5. Evolutionary populations and mixtures control weeds better than 
uniform varieties even though the scientific evidence on this 
advantage is still limited. 

From all these demonstrated advantages it is possible to extrapolate
that EPs and mixtures are able to adapt slowly to climate change as
long as they have, and are able to maintain, a large genetic diversity;
the way they are managed by farmers is therefore important. EPs and
mixtures, with their capacity to evolve in response to both biotic and
abiotic stresses, are likely to be the quickest and most cost-effective
evolving solution to such a complex and evolving problem as climate
change, with the additional advantage of increasing yield gains
resulting from a combination of natural and artificial selection
(Ceccarelli et al. 2010; Murphy et al. 2013; Ceccarelli and Grando
2020a).

The speed of adaptation depends also on the type of selection. 

Several field studies of natural populations of plants and animals 
have shown abundant evidence for directional selection on morphology 
and life history traits (number, size and sex ratio of offspring, the timing 
of reproduction, age and size at maturity and growth pattern, longev- 
ity) (Kingsolver et al. 2001). These studies have shown that tempo- 
ral fluctuations in magnitude and direction of directional selection are 
common (Siepielski et al. 2009) and that the magnitude of directional 
selection is sufficient to produce rapid microevolutionary changes in 
many populations. 

Directional natural selection is forecast to increase with more
frequent droughts and rising temperatures (Exposito-Alonso et al. 2019).

Despite all the potential benefits of mixtures and EPs, for many years 
the only example of practical exploitation remained the one on malting 
barley mixtures by Martin Wolfe described earlier (page 9). The ma- 
jor obstacle to translating into practice the several findings of research 
about the benefits of mixtures and EPs was likely the seed legislation, 
which in several countries is based on the principles of variety regis- 
tration and seed certification. Seed can be marketed only if it belongs 
to a variety that has been registered and the seed has been certified. A 
variety must satisfy distinctness, uniformity, and stability requirements 
(DUS) (Winge 2015). 

However, the European Union, recognizing the advantages of EPs
and mixtures, has invested considerably in research on diversity in
agriculture over the last 10 years. This research was conducted in
projects like “Strategies for Organic and Low-Input Integrated Breeding
and Management” (SOLIBAM) from 2010 to 2014, “Embedding Crop
Diversity and Networking for Local High Quality Food Systems”
(DIVERSIFOOD) from 2015 to 2019, “Boosting Organic Seed and Plant
Breeding across Europe” (LIVESEED), and finally with the new
EU “Farm to Fork” initiative.

In 2014, the European Commission, recognizing the contradiction
between funding research projects to promote the use of diversity in
agriculture and the seed laws that hinder it, issued an “Implementing
decision”² with which a “temporary experiment at Union level” was
organized “for the purpose of assessing whether the production, with a
view to marketing, and marketing, under certain conditions, of seed
from populations... belonging to the species Avena spp., Hordeum spp.,
Triticum spp., and Zea mays L., may constitute an improved alternative
to the exclusion of the marketing of seed not complying with the
requirements...”. The experiment, initially planned to end on
31 December 2018, was eventually extended to 28 February 2021.

Furthermore, in 2018, the European Commission adopted a new
Regulation (EU) 2018/848³, which lays down new rules on the control
of organic production and labelling of organic products. It makes
available for use in organic production plant reproductive material
that does not belong to a variety, but rather belongs to a plant
grouping within a single botanical taxon with a high level of genetic
and phenotypic diversity between individual reproductive units. In the
Regulation this material is defined as heterogeneous material, which:
a) presents common phenotypic characteristics; b) is characterised by
a high level of genetic and phenotypic diversity between individual
reproductive units, so that the plant grouping is represented by the
material as a whole, and not by a small number of units; c) is not a
variety within the meaning of Article 5(2) of Council Regulation (EC)
No 2100/94 (1); d) is not a mixture of varieties; and e) has been
produced in accordance with this Regulation.

In 2011, the International Fund for Agricultural Development 
(IFAD), based in Rome, funded a project “Using Agricultural Biodi- 
versity and Farmers’ Knowledge to Adapt Crops to Climate Change 
in Iran” with which EPB on wheat and barley was initiated in Iran by 
the International NGO CENESTA (Ceccarelli et al. 2022). As a fol- 
low-up, in 2018, IFAD funded the four-year project “Use of genetic 
diversity and Evolutionary Plant Breeding for enhanced farmer resil- 
ience to climate change, sustainable crop productivity, and nutrition 
under rainfed conditions”, implemented by Bioversity International 
and involving Uganda and Ethiopia in Africa, Jordan and Iran in the 
Middle East, and Nepal and Bhutan in South Asia. As part of this
project, a number of EPs and mixtures of wheat, barley, rice, and beans
were tested in several villages in the six countries with the
participation of farmers.


² https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=CELEX%3A32014D0150
³ https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=CELEX%3A32018R0848


8. 
HOW TO CONSTITUTE 
AN EVOLUTIONARY POPULATION 


Farmers can make mixtures and/or populations using their own
varieties or those available on the market. In this case, a distinction
between populations and mixtures has to be made, also in relation to the
mating system of the crop. As we said earlier, farmers can mix the seed
of pure line varieties of a self-pollinated crop, making a mixture. If
they intend to make a static mixture, they have to mix the seed of the
same varieties in the same proportion (or number of seeds) before
planting every year. If they intend to make a dynamic mixture, they will
use part of the harvest as seed at each planting. The dynamic mixture
then relies on the effect of natural selection (refer to the experiment
of Harlan and Martini, Table 2) and on the small percentage of natural
cross-pollination that always occurs in these species, and so becomes a
population with limited opportunities for recombination.
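
When a static mixture is made with the same number of seeds of each variety, the weight of seed needed differs between varieties with different seed size. The sketch below assumes that the thousand-kernel weight (TKW, in grams) of each variety is known; TKW is not discussed in this manual, and the variety names and values are hypothetical.

```python
def seed_weights_for_equal_numbers(tkw_grams, seeds_per_variety):
    """Grams of seed per variety so that each contributes the same number of seeds.

    tkw_grams: dict mapping variety name to thousand-kernel weight (g).
    """
    return {
        name: seeds_per_variety * tkw / 1000.0
        for name, tkw in tkw_grams.items()
    }


# Hypothetical varieties and thousand-kernel weights.
tkw = {"Variety A": 40.0, "Variety B": 50.0, "Variety C": 35.0}
weights = seed_weights_for_equal_numbers(tkw, seeds_per_variety=10_000)
for name, grams in weights.items():
    print(f"{name}: {grams:.0f} g")
```

Mixing equal weights instead of equal numbers would give the small-seeded variety more plants in the field than the large-seeded one.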

To make evolutionary populations of self-pollinated crops, farmers
have two options: either they become familiar with the crossing
technique of the crop(s) in which they are interested, or they rely on a
research institute to provide them with the mixture of segregating
populations obtained from the crosses of their interest. On the other
hand, in a number of self-pollinated or vegetatively propagated crops,
a number of varieties are F1 hybrids. Mixing the seed of a number of
different hybrids actually makes EPs with large opportunities for
recombination. The opportunities for recombination are even larger if
the F1 hybrids belong to a cross-pollinated crop.

In deciding whether to make mixtures or populations, one should 
keep in mind the experiment of Patel et al. (1987), described earlier, 
which shows that the evolution of mixtures is slower than the evolution 
of populations. 

There are two contrasting philosophies guiding the development
of either a mixture or an EP. One view is that mixtures and EPs
should be constructed after a careful choice of the components (in
the case of mixtures) or the parents (in the case of EPs) (Barot et al.
2017; Montazeaud et al. 2020) in such a way as to have a particular
combination of traits that is associated with a particular set of
agro-ecosystem services.

A second view, based on several studies, particularly by ecologists,
is to develop mixtures and EPs in such a way as to maximize genetic
diversity. The beneficial effect of diversity on ecosystem services has
been recognized since the last century, and the recent interest is due
to the increased awareness that the loss of biodiversity could affect
the functioning of ecosystems and therefore the services they provide to
humans (Isbell et al. 2011; Hooper et al. 2012). Ecologists have shown
that a) the several functions of ecosystems require a large number of
species (Hector and Bagchi 2007), b) the minimum required species
richness consistently increases with the number of ecosystem functions
considered (Zavaleta et al. 2010), and c) biodiversity stabilizes
ecosystem productivity and productivity-dependent ecosystem services by
increasing resistance to climate events (Isbell et al. 2015). Figure 32
summarizes the findings of these ecological studies, showing clearly
that the higher the number of services we expect the ecosystem to
deliver, the higher the number of components the mixture needs.


[Figure 32: sum of all services provided by the mixture (y-axis) plotted
against the number of varieties (x-axis), with one curve each for
1 service (e.g. yield), 2 services (e.g. yield + biological pest
control), 3 services (e.g. yield + biological pest control + maintenance
of soil fertility), 4 services, and N services.]

Figure 32. Expected effects of the number of varieties of a mixture on
the total production of services when a single or several services are
considered (redrawn from Barot et al. 2017).

The main limitation of these studies is that they refer only to static
mixtures, and not to EPs or dynamic mixtures, which, as we have seen
earlier, have a high evolutionary potential. The genetic composition of
EPs and dynamic mixtures evolves because new genotypes are continuously
generated by recombination and exposed to natural selection, and
possibly to human selection, as we will see later. This gives mixtures
and EPs the unique characteristic of possessing a large genetic
diversity, which is renewed at each generation and which can only
decrease as a consequence of intense directional selection acting in the
same direction in successive cropping seasons.
This continuously changing and evolving genetic diversity implies that
EPs and mixtures will likely generate genotypes with novel functional
traits that increase the fitness of the population.

The choice of how many or which varieties to mix or to cross also de- 
pends on the farmers’ objectives and on the characteristics of the crop. 
For example, if disease resistance is one of the problems affecting pro- 
ductivity in the target environment(s), one or more parents of the EP or 
one or more varieties in the mixture should carry the desirable genes for 
resistance to the targeted disease(s). 

Similarly, when weeds are a major problem, it is advisable to use 
parental material with good early vigour (Kissing Kucek et al. 2021b). 

Natural selection is not effective in the case of quality traits (such as 
taste, cooking qualities, flavour or nutrient content), so if quality traits 
are desired, these traits should be present in the parents of the EP or of 
the mixture and the quality should be checked year after year to ensure 
that it is maintained. 

There is some evidence in Italy from two cooperatives, la Terra e il
Cielo (www.laterraeilcielo.it/en/) and Rocca Madre (www.roccamadre.it),
and from a number of farmers, all practicing organic agriculture, who
produce commercially successful bread and pasta with EPs of bread and
durum wheat, respectively. These EPs have good grain quality because
the parents of the respective EPs originate from breeding programs
where grain quality was an important breeding objective. Data from
baking tests show that EPs specifically created for good baking quality
are as good as modern elite wheat varieties in terms of baking volume,
or better (for example in protein content and Hagberg falling number)
(Brumlop et al. 2017).

In crops such as bean, chickpea, lentil, rice, and others where
cooking time can differ considerably between varieties, the components
of mixtures need to have similar cooking times. In a crop like rice,
where the combination of grain width and grain length classifies types
with different commercial value and destination, grain characteristics
must also be considered in the selection of the components of EPs and
mixtures. The point, already made by Simmonds (1962), is that the need
for uniformity in a few traits does not have to be extended to every
other trait of the crop.

The increasing availability of genetic markers associated with the 
desirable genes will make the handling of EPs and mixtures even more 
effective, thus combining what is commonly referred to as “modern 
science” with the deployment of diversity. 

Recently, Merrick et al. (2020) suggested the creation of different
populations for target environments characterized by different
rainfall, showing that the performance of EPs obtained from just two
parents (biparental EPs or BPPs) and of CC was significantly affected by
their pedigrees. Furthermore, the populations performed better in the
environment for which they were developed and, as shown by several
other studies (e.g., those of Patel described earlier), they showed an
increase in stability and performance, with no differences between BPPs
and CC. It should be noted that in this context “environment” does not
necessarily mean the same location, as different locations could be
characterized by similar selection pressure and therefore have a similar
effect on the evolution and performance of an EP or mixture.


9.
HOW TO USE THE EVOLUTIONARY POPULATIONS 


9.1. The evolutionary population as the farmers’ crop 


The simplest and cheapest way of implementing evolutionary plant 
breeding (EPB) is for the farmers to plant and harvest in the same loca- 
tion (figure 33) without any intervention. The population will be planted 
and harvested, becoming the farmer’s crop. This is how several farmers in 
Italy and Iran use evolutionary populations. This implies that farmers will 
produce their own seed, an issue that we will discuss in detail on page 105. 

However, for the reasons mentioned above, this is possible only with 
some crops. For example, in the case of EPs of bread wheat and durum 
wheat in a number of countries, once the grains of the populations had 
been transformed into bread and pasta, respectively, they had an unex- 
pected commercial success. The same is happening in some countries 
with vegetables such as tomatoes or zucchini, or legumes such as beans. 


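[Figure 33: diagram in which the original population is distributed to
several sites; the legible site labels include dry and hot sites and
organic conditions.]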

Figure 33. The evolutionary population is planted and harvested in each of many sites (here 
only five are shown as an example) as the farmer’s crop. During the process farmers can share 
part of the seed with other farmers who plant the population under their own conditions and/or 
use the population for selection as described later.

As the population is planted in locations affected by different
stresses or different combinations of stresses, it will become
progressively better adapted to those stresses or combinations of
stresses, and will therefore diverge into different sub-populations. We
included organic conditions in figure 33 because organic agriculture,
particularly in the Mediterranean countries, represents, from a plant
breeder’s perspective, a heterogeneous TPE, fundamentally different
from the more homogeneous TPE typical of industrial agriculture. In the
latter, the use of chemicals such as pesticides (fungicides,
insecticides, and herbicides) and fertilizers has a powerful effect in
smoothing most of the differences, leaving only those that are largely
unpredictable because they are due to climate. To serve such a
heterogeneous population of environments, characterized by different
climates, soils, landscapes, organic practices, clients and markets, a
highly flexible and dynamic breeding strategy such as EPB is needed,
fundamentally different from, and not compatible with, corporate
breeding. This will be further discussed under “Seed production of
evolutionary populations”.

Once the farmer has satisfied her/his needs, such as having enough
seed for planting the following cropping season, feeding her/his
livestock, etc., s/he may sell or donate part of the seed to one or more
neighbours, who can start their own EP to be handled in the same way.

It is suggested that at each cycle each farmer stores a sufficient
amount of seed. The amount depends on the crop (in the case of bread
and durum wheat, about 30,000 seeds) but should be enough to contain
as much diversity as possible. If a catastrophic event leads to the
complete loss of the population after n years of evolution, the remnant
seed allows the farmer to go back to the population as it was after
n-1 years of evolution/adaptation, so that the benefits of the
adaptation accumulated up to that point are not lost.
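
How much diversity a stored sample is likely to contain can be approximated with a simple calculation. The sketch below assumes that the stored seed is a random sample from a much larger harvest, so that the chance of keeping at least one seed of a genotype present at frequency p in a sample of N seeds is 1 - (1 - p)^N. The function names and the frequencies are hypothetical.

```python
import math


def retention_probability(frequency, sample_size):
    """Chance that at least one seed of a genotype is in the stored sample."""
    return 1.0 - (1.0 - frequency) ** sample_size


def seeds_needed(frequency, target_probability=0.95):
    """Smallest sample that keeps a genotype with the target probability."""
    return math.ceil(
        math.log(1.0 - target_probability) / math.log(1.0 - frequency)
    )


# A genotype present in 1 seed out of 10,000, with 30,000 seeds stored.
print(f"{retention_probability(0.0001, 30_000):.3f}")

# Seeds to store to keep, with 95% probability, a genotype present
# in 1 seed out of 1,000.
print(seeds_needed(0.001))
```

The rarer the genotypes one wants to preserve, the larger the remnant sample has to be, which is why the suggested amount is expressed as a number of seeds rather than as a weight.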

One obvious advantage of using an EP as the farmer’s crop is that, in
addition to all the other advantages shown by nearly 90 years of
research, the farmer will produce her/his own seed. Besides ideological
motivations, the biological motivation is that there cannot be better
seed than that produced by a population specifically adapted to the
soil, climate, and agronomic practices of that farmer or group of
farmers.


9.1.1. How quickly does an evolutionary population become adapted?


Evolutionary theory tells us that the speed of adaptation depends on 
the initial amount of genetic diversity and on the intensity of selection 
as indicated also by the breeder’s equation (pg. 31). In this context, 
experimental evidence tells us that “populations” as defined earlier do 
evolve more rapidly towards better performance than “mixtures” (AI- 
lard and Jain 1962; Khalifa and Qualset 1975) and therefore it is im- 
portant to use these two terms properly and not as near synonyms as 
frequently happens. This was shown clearly in the experiment of Patel 
et al. (1987) described on page 44, and it is also shown in an experi- 
ment conducted on the nematode Caenorhabditis elegans (a free-living, 
transparent nematode, about 1 mm in length), by Morran et al. (2009). 
The populations of this nematode are composed of hermaphrodites and 
males: hermaphrodites can reproduce by selfing or by outcrossing with 
males. In natural populations (wild type), outcrossing occurs with a fre- 
quency of <5%. Notice that this corresponds to the situation in most 
self-pollinated crops. 
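
The dependence of the speed of adaptation on genetic diversity and on selection intensity can be made concrete with the breeder's equation in one of its common textbook forms: the expected response per generation equals the selection intensity multiplied by the additive genetic variance and divided by the phenotypic standard deviation. The notation used on pg. 31 may differ; the function name and all the numbers below are hypothetical and serve only to show the direction of the effect.

```python
import math


def response_to_selection(intensity, additive_variance, phenotypic_variance):
    """Expected response per generation: R = i * var_A / sqrt(var_P).

    intensity           -- standardized selection intensity (i)
    additive_variance   -- additive genetic variance (var_A)
    phenotypic_variance -- total phenotypic variance (var_P)
    """
    if phenotypic_variance <= 0:
        raise ValueError("phenotypic variance must be positive")
    if not 0 <= additive_variance <= phenotypic_variance:
        raise ValueError("additive variance must lie between 0 and var_P")
    return intensity * additive_variance / math.sqrt(phenotypic_variance)


# Hypothetical values: same selection intensity and phenotypic variance,
# but a narrow versus a wide genetic base.
narrow_base = response_to_selection(1.0, 2.0, 16.0)
wide_base = response_to_selection(1.0, 8.0, 16.0)
print(narrow_base, wide_base)
```

Under these assumptions the population with the wider genetic base responds four times faster to the same selection pressure, which is the reason for starting an EP with as much diversity as possible.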

In the experiment, two mutants were used, with outcrossing
frequencies of 0% (obligate selfing) and 100% (obligate outcrossing),
respectively. The two mutants (0% and 100% outcrossing) and the wild type
(<5% outcrossing) were exposed to a virulent pathogen, which caused
80% mortality, thus imposing a severe selection pressure on the
nematode. After 40 generations (the generation time of the nematode is 2-3
weeks), the population with 100% outcrossing had adapted very rapidly to
the new conditions (the presence of the pathogen), while the population
with 100% selfing had not. Of particular interest for a possible
extrapolation to the management of EPs of self-pollinated crops is that the
wild-type population was able to adapt slowly and evolved a higher
outcrossing rate. This could indicate that EPs of self-pollinated crops could
generate additional diversity through a higher frequency of cross-pollination
when grown under stress conditions.


9.1.2. Does an evolutionary population lose diversity with time?


The answer to this question depends on the amount of genetic
variability in the original population, the heritability of the traits under
selection, and on the type of selection.

With a modest initial genetic diversity and a strong directional
selection, i.e., continuous selection always in the same direction, we can
expect a rapid decline of diversity.

It is also possible to have a strong and sudden decline of diversity if 
there is an exceptional event such as an extreme climatic condition with 
few plants surviving. One strategy to avoid this, as mentioned earlier, is 
to always keep a certain amount of seed in a cold and dry place; in fact, 
if a catastrophic event happens after, let’s say, nine years of evolution, 
going back to the stored seed allows losing only the last year of evolu- 
tion, as indicated earlier. 

In the case of a catastrophic event, the few surviving plants need to be 
selected and possibly maintained as a separate population. 

If we start with a population or a dynamic mixture with a large amount 
of genetic diversity, this will be maintained for a long time, particularly 
because directional selection seems to become highly improbable given 
the short-term weather variability so common in recent times. From this 
point of view, the increased year-to-year variability in weather
conditions should be considered our ally rather than our enemy.

Finally, seed exchange between neighbouring farmers growing
the same EP and sharing similar agronomic and climatic conditions 
could be an important mechanism to maintain diversity. 


9.2. Selection within evolutionary populations 


Because of several differences in the mating systems, largely depend- 
ent on floral biology, it is essential to become familiar with the repro- 
ductive systems before embarking on selection within EPs. 

Very few plants are dioecious (pistachio, kiwi, asparagus, spin- 
ach, dates), namely with the female and male organs on different in- 
dividuals and therefore with the presence of male and female plants. 
The majority of plants are monoecious, i.e., the female and male 
organs are on the same individual; they can be within the same flow- 
er (examples are wheat, barley, rice, beans, tomato, eggplant) or in 
different parts of the same plant, the most well-known example of 
which is maize, where the male flower is on the top (tassel) of the 
plant and the female (one or more) is on the side of the plant (the 
cob that we eat). 

Actually, in the case of maize, wheat, barley, and rice, we should
talk about inflorescences, because the tassel, the cob, the spike, and the 
panicle are made of several individual flowers. 

Most of the crops with the female and male organs in the same flower 
are self-pollinated, with a certain percentage (usually 1-5%) of
cross-pollination depending on the genotype and the environment.

The breeder and the farmers (both men and women) can super- 
impose artificial selection with criteria that may change from loca- 
tion to location, from crop to crop, and in the same location and in 
the same crop, with time. While the population is evolving, lines or 
sub-populations can be derived by collecting spikes, panicles, pods, 
berries, cuttings, etc., depending on the crops. The lines or sub-pop- 
ulations can then be tested as pure lines (in the case of self-pollinat- 
ed crops), clones (in the case of vegetatively propagated crops) or 
populations (in the case of cross-pollinated crops) in participatory 
breeding programs, or can be used as multilines, or a subsample of
the population can be directly used for cultivation. The key aspect 
of the method is that, while the lines are extracted, the population 
is left evolving for an indefinite amount of time, thus becoming a 
unique source of continuously better-adapted genetic material di- 
rectly in the hands of the farmers. 

Specific cases where intervening with artificial selection is highly 
desirable are those of plant height and maturity. 

As shown by several experiments, when in competition, tall plants 
dominate short plants, which tend to disappear from the population. 
However, several farmers have observed in cereals like wheat that a
proportion of short plants in an otherwise tall population increases
lodging resistance without reducing, and possibly even improving, the
positive effect of a tall population in controlling weeds. We have also
observed (pg. 55, Vidal et al. 2017) that a canopy made of tall and short
plants with different levels of disease resistance slows down the spread
of disease.

While the problems associated with differences in maturity within 
an EP can be reduced by a careful selection of the parents, diversi- 
ty in maturity can be highly beneficial in stabilizing yields across 
seasons because early types can escape drought, while late types 
benefit more from wet seasons. The research cited earlier (Fletcher
et al. 2019) has shown that wheat cultivar mixtures made with
components differing in maturity time are able to stabilize the
combined risks of frost, heat, and drought stress in frost and heat-prone
dryland environments, providing growers with a strategy to manage 
these uncertain risks. 

If a farmer is in an area with frequent droughts, it may also be
advisable, before harvesting, to select as many early spikes as possible,
following one of the methodologies described below. This will allow the
establishment of an early subpopulation, which on the one hand will
increase farm agrobiodiversity, and on the other hand will ensure some
income even in very dry years.

With suitable modifications, this can be applied to every crop. 


9.2.1. Selection within evolutionary populations of self-pollinated crops 


9.2.1.1. Spike selection 

If the EP is considered either as a source population for selection, or 
part as a source population and part as a crop, the area where selection 
will be practiced should be planted in a field, chosen according to the 
criteria discussed earlier, with alleys to avoid damaging the crop and 
to allow an equal probability of selecting spikes from any plant in the 
population regardless of its position (figure 34). 

The number of strips can vary depending on the land available and
can be just one plot. However, the area to be used for selection
should be planted with a sufficient number of seeds to be safely consid- 
ered as a random sample of the EP. 

The figure refers to crops such as small grain cereals, which are
usually planted as a solid stand. For crops planted as spaced plants (e.g.,
maize and many horticultural crops), selection can be done without any 
special spatial arrangement. 

In dry areas and/or in dry years, we expect that an empty area, no
matter how large, such as the one in figure 34, might create border
effects: the plants on both sides of each strip, and at its beginning
and end, may be taller and may remain green longer, as they have less
competition and therefore more light, more nutrients and more water
available than the plants at the center of the strip.

Figure 34. The evolutionary population (or part of it) is planted in strips (the spacing suggested
is for a cereal crop such as barley, wheat and rice). The farmers and scientists can
walk in the 80 cm path and reach the spikes or the panicles they intend to select within the 3 m
wide plots. The black plots are planted with the farmer’s variety as a reference during selection.


Figure 35 shows, in the case of barley, the extent of the border effect
due to an empty alley between plots (right) and the absence of border
effects without alleys (left).

In the case of a situation like the one at the right of figure 35, the 
selection of spikes should avoid the plants growing at the border even 
if they look very attractive, because their attractiveness is most likely 
only due to their favourable position in the plot. 


Figure 35. Absence of border effects in a barley field trial with no alleys (left) and presence of 
border effects in a barley field trial with alleys (right). 

Before going into further details of spike selection, we recommend
that once a desirable spike, or panicle, or cob is identified, only half of 
it is actually collected to maintain the desirable genes in the population 
which will continue to evolve (see figure 36). In the case of legumes, 
only a few pods should be collected from the selected plants. 

Stratified selection: in order to minimize environmental biases, 
spike selection should be done by considering the field as a grid of 
a number of quadrats and by selecting the best spikes from within 
each quadrat. In this way, the danger of selecting all the spikes from 
that part of the plot/field that has the best environmental conditions 
will be avoided. 

On the other hand, the best spikes, or panicles, cobs, or pods (hereaf- 
ter we will use only spikes as an example) in the worst part of the field 
may indicate the presence of plants able to grow well under difficult 
conditions; these plants would be missed should the selection be con- 
ducted only in the best-looking portion of the field. 

The size and number of quadrats depend on the variability of the
soil and other environmental conditions of the plot of land, according
to the farmers’ knowledge: in general, the higher the variability in the
field, the higher the number of quadrats and the smaller the size of each
quadrat should be.

The number of spikes to be collected from each quadrat depends on
the size of the quadrat and the amount of seed that the farmer wants to 
have in order to start a new sub-population. The following section pro- 
vides guidelines on this issue. 
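
The stratified selection described above can be sketched in a few lines of code. The example assumes, purely for illustration, that each candidate spike has been given a position in the field (in metres) and a visual score by the farmer; the function name, the scoring scale, the grid and the field dimensions are all hypothetical.

```python
from collections import defaultdict


def stratified_selection(plants, field_width, field_length,
                         n_cols, n_rows, per_quadrat):
    """Pick the best-scoring plants within each quadrat of a grid.

    plants is a list of (x, y, score) tuples; x runs along the field
    width and y along the field length, both in the same unit.
    Returns a dict mapping (row, col) to the selected plants.
    """
    quadrats = defaultdict(list)
    for x, y, score in plants:
        # Find the grid cell; plants on the far edge go in the last cell.
        col = min(int(x / field_width * n_cols), n_cols - 1)
        row = min(int(y / field_length * n_rows), n_rows - 1)
        quadrats[(row, col)].append((x, y, score))

    selected = {}
    for cell, members in quadrats.items():
        ranked = sorted(members, key=lambda plant: plant[2], reverse=True)
        selected[cell] = ranked[:per_quadrat]
    return selected


# Hypothetical 10 m x 10 m plot, 2 x 2 grid, scores from 1 (poor) to 9.
plants = [(1, 1, 5), (2, 2, 9), (8, 1, 3), (7, 2, 4),
          (1, 8, 1), (2, 9, 2), (8, 8, 7), (9, 9, 6)]
selected = stratified_selection(plants, 10, 10, 2, 2, per_quadrat=1)
for cell in sorted(selected):
    print(cell, selected[cell])
```

In this made-up example a spike with a score of only 2 is retained because it is the best in the poorest quadrat, whereas a selection based on the whole field would have taken all spikes from the two best corners.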

The spikes selected in any one year can be used in one of the follow- 
ing ways: 

1. The selected spikes are threshed individually and the seed of 
each spike will be planted separately in a row. This can be done by 
the researchers on station or preferably by the farmer. If, because of
technical problems, the head-rows are planted on station, they should
only be multiplied, and selection should be delayed until the
following year;

2. If the head-rows are planted by farmers, they should be planted un- 
der the same stressful conditions (for example less irrigation) in which 
they were selected in order to continue the selection; 

3. The seed collected on the selected rows can be handled in three 
different ways (figure 36 paths 3, 4, and 5). 




[Figure 36 diagram. Labels: Base Population; Population Year 1, Year 2, Year 3, Year 4, ..., Year n;
Spike or plant selection; One spike = one row; All spikes mixed; Individual selected rows;
Mixed selected rows; PPB trials; Sub-population; Sub-population. Note in the diagram: steps 2 to 5
of the processes above can be repeated every 3-4 years.]

Figure 36. Handling of an evolutionary population in a self-pollinated cereal crop: path 1 has
been described in figure 33. Path 2 is single-spike selection, which can then be followed by
head-rows (paths 3 and 4) or by the creation of sub-populations (paths 4 and 5). With most
legumes, the same scheme can be applied using the pods of the same plant as the unit of selection.


9.2.1.2. Spike selection to feed a PPB program (figure 36, paths 2 and 3) 

The seed harvested on the selected head-rows can be planted in small 
plots and further selected as shown in path 3. The seed produced on the 
selected small plots should be in sufficient amount to establish a Stage 
1 PPB trial as described in the PPB manual (Ceccarelli 2012): each
selected row will be an individual entry in the Stage 1 trial. Paths 2 and 3 can
be repeated every year leading to a situation where in the village or in 
the research station there will be short rows and small plots, and in the 
village, PPB trials of the various stages. 

This is how evolutionary populations can feed a participatory breed- 
ing program with participation at two levels, firstly with farmer selec- 
tion of spikes from within the evolutionary populations and secondly 
with the evaluation by farmers of the plots derived from those spikes in 
the various stages of the PPB program. 

If this is considered difficult to manage, a new cycle of spike selec- 
tion can be initiated when the material derived from the previous cycle 
has reached Stage 4. 

9.2.1.3. Spike selection to create sub-populations (figure 36 paths 2
and 4 or 2 and 5)

Rather than being planted again as separate rows for another cycle 
of selection, the seed of the selected rows can be mixed to establish a 
sub-population. Alternatively, the individual spikes, instead of being
kept separate and planted as separate rows, can be mixed after harvest- 
ing and threshed together, and the resulting seed can be planted as a 
sub-population. 

The two sub-populations resulting from paths 2 and 4 and from paths 2
and 5 represent a way to accelerate the process of adaptation; the first is
expected to be more efficient because the selection is based on families
rather than on individuals. The (improved) sub-population could
eventually become the farmers’ crop. However, it is advisable to maintain a
sufficiently large area of the base population which has a much wider 
genetic base and therefore a higher evolutionary potential. 


9.2.2. Selection within the evolutionary populations of cross- 
pollinated crops 


Cross-pollinated crops are expected to evolve much faster than 
self-pollinated crops and in principle they are the ideal crops for EPB. 

In a cross-pollinated crop such as maize, selection can be very 
effective and can be done in two steps. Firstly, before the male in- 
florescence is mature and produces pollen, it is advisable to discard 
undesirable plants (affected by diseases or insects, susceptible to 
lodging, etc.) by “detasselling”, namely cutting off the male inflo- 
rescence. In this way the undesirable plants will not be able to pass 
their genes to the following generation. Secondly, at maturity, one
can select, among the plants which were not detasselled, those
with the best plant and cob characteristics. A final cob selection can
be done after harvest, based on the kernels’ characteristics once the 
cobs are completely dry. 

As indicated earlier for the self-pollinated crops, it is advisable that 
each selected cob is roughly divided in half. One half will be used 
with all the other selected halves, to assemble an “improved popu- 
lation”, while the other halves will be mixed with all the other cobs 
harvested on the plants in the field to re-assemble the EP and let it 
continue evolving. 

In maintaining EPs of cross-pollinated crops, isolation from
undesirable or uncontrolled sources of pollen is important. Isolation can be
obtained by distance or by living barriers such as trees or rows of a 
different crop. 


9.2.3. Selection within evolutionary populations of clonally 
propagated crops 


Several crops are clonally (or vegetatively) propagated. These
include all important root and tuber crops, and nearly all types of fruit and
forest trees.

In breeding clonally propagated crops, methods used in cross-polli- 
nated and in self-pollinated crops can be very useful. The general prin- 
ciple in breeding these crops, which is important for the development 
and management of EPs, is a crossing phase to produce sexual seed 
and new genetic variation from which to select new clones (Grüneberg
et al. 2009). Therefore, the crossing phase corresponds to the creation 
of the EP, which can then be exploited by vegetative propagation. The 
peculiarity of clonally propagated crops, or crops which can be clonally 
propagated, is that each plant selected out of the products of the cross- 
ing phase is potentially a new variety. 


9.3. Diversity in the field, uniformity in the market 


The experiment on rice by Zhu et al. (2000) mentioned on pg. 17
illustrates an example of how to grow diversity in the field and sell 
uniformity in the market in those cases and for those crops in which 
uniformity is a market requirement. 

In the case of rice, the experiment was conducted to test the advan- 
tage of growing alternate rows of varieties susceptible and resistant to 
rice blast in reducing the incidence and severity of blast (figure 37), but 
at the same time it makes the point that, because the two varieties were 
not physically mixed, it was still possible to harvest them separately 
and therefore bring to the market the grain of the two varieties. 

Figure 37. A mixture of two rice varieties obtained by alternating two rows of one variety and
one row of the other variety (redrawn from Li et al. 2009). 


In the case of rice, there are examples of farmers who have been 
experimenting with mixtures of two varieties with different physical 
characteristics of the grain. On the one hand, a considerable reduction 
of blast was obtained while, on the other, the grain of the two varieties 
could be separated easily after harvesting. 

The concept of growing diversity in the field has been extended to 
different crops by Li et al. (2009) in an intercropping experiment with 6 
crops (tobacco, maize, sugarcane, potato, wheat, and broad bean) in four 
combinations, namely tobacco-maize, sugarcane-maize, potato-maize, 
and wheat-broad bean (figure 38). The four crop combinations were 
compared with the respective monocrop in adjacent plots in collabo- 
ration with farmers in the Yunnan province on 15,302 hectares in two 
years. Some combinations increased yield by between 33.2 and 84.7%.

In these experiments, even when yields in intercropping were
comparable with those in monoculture, disease severity was decreased: for
example, in the tobacco-maize intercropping, the severity of maize leaf 
blight was reduced by 17.0 and 19.7% while with the sugarcane-maize 
intercropping, it was reduced by 55.9 and 49.6% over two years. The 
same concept can be easily applied to fruit trees. 

Figure 38. Examples of intercropping: each symbol represents a plant (hill) of a different crop:
tobacco (©), maize (X), sugarcane (@), potato (0), wheat (A), broad bean (G) (redrawn from Li 
et al. 2009). 


9.4. Mixtures that should not evolve (static mixtures) 


There are particular situations in which it may be convenient to 
re-constitute the mixture every year. This is the case when, for
example, a mixture of n bread wheat varieties, each in a given percentage,
is found to produce a commercially successful product (i.e., bread or
biscuits). If the success is associated with the specific proportions of the
n varieties, there will be no advantage in letting the mixture evolve
because this might mean losing the specific characteristics of the mixture.
In such a case, all that is needed is to grow the n varieties separately and 
to re-constitute the mixture every year before planting. 
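
Re-constituting a static mixture is a simple proportioning exercise, sketched below. The variety names, the proportions and the total amount of seed are hypothetical, and the sketch assumes that the percentages are expressed by weight of seed, which the text above does not specify.

```python
def reconstitute_mixture(total_seed_kg, proportions):
    """Return the kg of seed of each variety needed for a static mixture.

    proportions maps each variety name to its share of the mixture
    (by weight of seed, as a fraction); the shares must add up to 1.
    """
    if abs(sum(proportions.values()) - 1.0) > 1e-9:
        raise ValueError("the proportions must add up to 1")
    return {variety: total_seed_kg * share
            for variety, share in proportions.items()}


# Hypothetical three-variety bread wheat mixture, 200 kg of seed in total.
recipe = {"variety A": 0.5, "variety B": 0.3, "variety C": 0.2}
seed_needed = reconstitute_mixture(200.0, recipe)
for variety, kg in seed_needed.items():
    print(f"{variety}: {kg:.1f} kg")
```

Because the recipe is applied afresh to separately grown varieties each year, the proportions planted never drift, whatever happened to each variety in the previous season.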

Another, more complex example is a mixture of two different crops, 
barley and wheat, common in some areas of Ethiopia and Eritrea and 
known as hanfets. Farmers usually mix 60-65% of wheat, which till- 
ers less than barley, and 35-40% of barley. One of the reasons farmers 
grow the mixture is to stabilize yields across seasons because barley is 
more drought tolerant than wheat, while wheat is more productive than 
barley in wet years. Therefore, it is not convenient to let the mixture 
evolve because, for example, after a dry year the composition of the
mixture will be modified in favour of barley: if part of the harvested 
seed is planted in the following year and the season turns out to be wet, 
the amount of wheat left in the hanfets may be less than the optimum to 
fully exploit the wet season. 
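
The shift in composition described above can be illustrated with a small calculation. The multiplication rates used here (kg harvested per kg of seed planted in a dry year) are invented for the example and are not data from the research on hanfets; the planted shares are assumed to be by weight of seed.

```python
def harvested_shares(planted_shares, multiplication_rates):
    """Composition of the harvested seed of a two-crop mixture.

    planted_shares       -- share of each crop in the seed planted
    multiplication_rates -- kg harvested per kg planted, for each crop
    """
    harvested = {crop: planted_shares[crop] * multiplication_rates[crop]
                 for crop in planted_shares}
    total = sum(harvested.values())
    return {crop: amount / total for crop, amount in harvested.items()}


# Hypothetical dry year in which barley multiplies better than wheat.
planted = {"wheat": 0.60, "barley": 0.40}
dry_year_rates = {"wheat": 8.0, "barley": 14.0}
after_dry_year = harvested_shares(planted, dry_year_rates)
for crop, share in after_dry_year.items():
    print(f"{crop}: {share:.0%}")
```

With these assumed rates, wheat drops from 60% of the seed planted to about 46% of the seed harvested, so planting the harvested seed as it is would start the next season well below the proportion of wheat the farmers aim for.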

Research conducted in Eritrea for three cropping seasons has
confirmed that hanfets yields more than the wheat component and is more
stable than the pure wheat and barley (Woldeamlak et al. 2008).


9.5. Evolutionary populations and seed systems 


We mentioned earlier that one of the advantages of cultivating EPs is 
that they bring back seed production into farmers’ hands. This is because, 
for biological reasons, there cannot be better seed than the one produced 
by a population, which, through evolution, becomes better and better 
adapted to the physical and agronomic conditions where it evolves. 

This means that it is difficult to reconcile the formal seed system, 
typically centralized, as described by Coomes et al. (2015) with such a 
biodiverse agricultural landscape as the one generated by a generalized 
use of EPs by farmers. For this reason, with the introduction of EPs in 
any country it is necessary to develop a different type of seed system 
with characteristics common to the informal seed system but with the 
ability to accommodate the evolving properties of the EPs. 

For those crops for which the EPs cannot be used as the “farmer’s 
crop”, but need to be used as a source population, a different type of 
plant breeding program is needed. 

Based on the experience in participatory plant breeding, the ideal 
breeding strategy to fully exploit the potential of EPs and mixtures is 
a decentralized-participatory model supported by a “farmers-managed 
diffuse seed system” which should be organized within each target en- 
vironment (TE) (figure 39) in the form of cooperatives or associations. 

Once established, one or more EPs, one for each crop, are distributed
to a number of farms within a given TE. Not all EPs need to be distributed
to all farms, but the farms should represent as much as possible the
diversity within the TE, both climatically and agronomically, as well as
the target clients. The EPs should be grown as is usually done for that
crop, with the practices (including organic practices and rotations)
used on that farm.

The EPs should be planted using the seed harvested on the same
farm. This implies that the farm producing EP seed needs to have facilities
for cleaning, storing, and packing the seed. At each cropping season, the date
of planting, seed rate, and agronomic management should be recorded in
detail. Collecting climatic data (daily maximum and minimum temperature,
and rainfall) is highly desirable to understand the dynamics (extent and
speed) of the evolutionary changes in the EPs. Also, at each cropping
season and as described earlier, it is highly recommended that the amount
of seed corresponding to about 30,000 plants in the case of cereal field
crops is safely stored.
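
One possible way of keeping the records recommended above is sketched here as a simple data structure. The field names, units and example values are assumptions made for illustration; any notebook or spreadsheet holding the same information would serve the same purpose.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass
class DailyWeather:
    day: date
    max_temp_c: float
    min_temp_c: float
    rainfall_mm: float


@dataclass
class SeasonRecord:
    """Record of one cropping season of one EP on one farm."""
    farm: str
    crop: str
    planting_date: date
    seed_rate_kg_per_ha: float
    management_notes: str = ""
    reserve_seed_stored: bool = False
    weather: List[DailyWeather] = field(default_factory=list)

    def total_rainfall_mm(self) -> float:
        """Sum of the daily rainfall recorded so far."""
        return sum(day.rainfall_mm for day in self.weather)


# Hypothetical farm and values.
record = SeasonRecord(farm="Farm A", crop="bread wheat",
                      planting_date=date(2021, 11, 5),
                      seed_rate_kg_per_ha=180.0,
                      management_notes="organic, after faba bean")
record.weather.append(DailyWeather(date(2021, 11, 5), 18.0, 7.5, 0.0))
record.weather.append(DailyWeather(date(2021, 11, 6), 16.5, 8.0, 12.5))
record.reserve_seed_stored = True
print(record.total_rainfall_mm())
```

Keeping one such record per season makes it possible to relate the changes observed in the EP to the conditions under which it was grown.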

Figure 39. A general model of a Decentralized Participatory-Evolutionary Plant Breeding based
on evolutionary populations or dynamic mixtures supporting a farmers-managed diffuse seed
system. The grey circles represent the selection sites within a given target environment (TE),
here ten as an example, which may correspond to a region or a province. The arrows indicate both
the flow of material and of information (modified from Ceccarelli and Grando 2020b).


Depending on the crop, the market, and the genetic structure of the fu- 
ture variety, the farmer, or the farmer and the breeder together, can use all 
or part of the EP for selection using one of the methods described earlier. 

Figure 40 shows the details of the selection process in the case of
a single EP within each organic farm or community of organic farms. 
While the EP evolves over time (A), selection is being practiced by 
farmers (B). Selection can be done in different ways depending on the 
crop and the type of variety the farm or the community is aiming at. 

For example, it is possible to select uniform varieties for some 
types of markets, uses, or seed systems, and select heterogeneous ma- 
terial for other types of markets, uses, and seed systems. Participation 
includes men and women farmers, but can include other stakeholders 
depending on the crop, for example bakers and millers in the case 
of cereals. It is therefore recommended that participants’ opinions be 
disaggregated by gender and/or by the profession of participants as 
this will allow a more precise targeting of the final product to poten- 
tial clients and/or value chains. 
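
The disaggregation recommended above can be done with a very small tabulation, sketched below. The entries, the scores and the scoring scale are invented for the example, and the function name is hypothetical.

```python
from collections import defaultdict
from statistics import mean


def mean_score_by_group(evaluations, group_key):
    """Average the scores of each entry separately for each group.

    evaluations is a list of dicts with the keys 'entry', 'score' and
    the grouping key (for example 'gender' or 'profession').
    """
    grouped = defaultdict(list)
    for evaluation in evaluations:
        key = (evaluation["entry"], evaluation[group_key])
        grouped[key].append(evaluation["score"])
    return {key: mean(scores) for key, scores in grouped.items()}


# Hypothetical scores from 1 (poor) to 5 (excellent).
evaluations = [
    {"entry": "line 1", "gender": "woman", "profession": "farmer", "score": 4},
    {"entry": "line 1", "gender": "man", "profession": "farmer", "score": 2},
    {"entry": "line 1", "gender": "woman", "profession": "baker", "score": 5},
    {"entry": "line 2", "gender": "man", "profession": "miller", "score": 3},
]
by_gender = mean_score_by_group(evaluations, "gender")
by_profession = mean_score_by_group(evaluations, "profession")
print(by_gender)
print(by_profession)
```

In this made-up case the same line is scored quite differently by women and men, a difference that an overall average would hide.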

Participation continues (C) during the development of varieties from 
the initial selection through Multi Environment Trials (MET) conduct- 
ed on neighbouring farms or on different farms of the community. 

Figure 40. The participatory selection within an evolutionary population within a single organic
farm or within a community of farms representing a TE using the EP as source population for 
the PPB program (Ceccarelli and Grando 2020b). 

The model shown in figure 40 can be replicated in every TE. The
use of evolutionary plant breeding as a breeding method for small 
cereals under organic conditions has also been discussed by Murphy 
et al. (2005). 

A “farmers’-managed diffuse seed system” is, by its nature, loca- 
tion-specific and ideally an integral part of a value chain where the seed 
is at the starting point and the grain (to be used as such or to be trans- 
formed) is at the end. 

Such a model can then be replicated multiple times in each TE, the
number of times depending on the extent of the area cultivated with a
given crop.


9.6. Current use of evolutionary populations 


To the best of our knowledge, the two countries where evolutionary 
populations have spread widely becoming the farmers’ crops are Iran 
and Italy. 

The two authors constituted three EPs while working as plant breed- 
ers at the International Center for Agricultural Research in the Dry Ar- 
eas (ICARDA), then in Aleppo (Syria). The three populations are now 
grown widely in Italy and to a lesser extent in other countries, while one 
of them is grown in Iran together with locally developed EPs. 

In October 2008, we mixed the seeds of nearly 1,600 F2 of barley
representing all the crosses we made two years earlier between varieties 
from all over the world including old local varieties and even the wild 
progenitor of barley. We thus obtained about 160 kg of seed, some of 
which was sent to five farmers in Syria and to our partners in Algeria, 
Eritrea, Jordan, and Iran, recommending that in each country the seed 
be divided between five different farmers. The aim was to allow farm- 
ers in some of the countries that had collaborated in the PPB programs 
to use the skills they had refined during those programs to independent- 
ly manage the diversity present within EPs. 

In Iran, a local scientist was interested in the large diversity shown
by the barley EP and decided to make a bread wheat EP with his own
breeding material. The Iranian farmers were so satisfied with the
population’s performance that they shared the seed of the barley EP with
other farmers in other provinces, both through the International NGO
CENESTA’s PPB program and informally with neighbours, friends,
and relatives. The work with EPs in Iran was supported by a small grant
from the International Fund for Agricultural Development (IFAD) and
within a few years, from 2010 to 2014, the populations were grown on
several hundred hectares in 17 provinces by about 150 farmers (figure 41)
and continue to spread.

Figure 41. Distribution of wheat, barley and rice evolutionary populations in Iran (Ceccarelli
et al. 2022) 


The spread of the participatory breeding program first, and of the
EPs later, was greatly facilitated by the former Director of the
Provincial Agricultural Office of Kermanshah province (figure 42). With his
support, EPs had an almost immediate commercial success and the pop- 
ulations spread to different provinces. 

Figure 42. An evolutionary population of bread wheat in Ravansar, Kermanshah province, Iran.


At the same time, the barley EP spread among the nomads who used 
it successfully as animal feed. 

In 2013, as a collaboration between CENESTA and the Iranian Rice 
Research Institute, different mixtures of Iranian rice landraces were de- 
veloped. 

At ICARDA in 2009, following the barley EP made in 2008, we made 
an EP of durum wheat by mixing the F, seed of a little more than 700 du- 
rum wheat crosses and one of bread wheat by mixing F,, F,, and F, seeds 
of nearly 2,000 bread wheat crosses (figure 43). These two populations 
also harboured a great deal of diversity, not only because of the high num- 
ber of crosses on which they were based, but also because, as in the case of 
the barley EP, the varieties used for crosses came from all over the world 
and included wild relatives. The two wheat EPs were sent to Morocco, 
Algeria, and Jordan, as well as distributed to some farmers in Syria. 

In 2010, with the collaboration of the Italian Association of Organic
Agriculture (AIAB), within the framework of a European project, the
seed of the three ICARDA EPs was sent to Italy and then spread from a
few initial farmers to several farmers in almost every region in Italy by
informal seed exchanges; the majority are now using the EPs directly to
produce pasta or bread, while at the same time saving part of the seed
for the following cropping season.


Figure 43. The ICARDA evolutionary population of bread wheat in Central Italy. 


The Italian experience with EPs has shown once again their
extraordinary ability to evolve differently in different locations,
generating different populations that can be easily recognized as shown
in figure 44. 

Figure 44. An example of divergent natural selection of an evolutionary bread wheat
population: the ICARDA evolutionary bread wheat population after 10 years of evolution in Sicily
(left) and the same population after 10 years of evolution in Tuscany (right) grown side by side in
Marche in 2020 (courtesy of Pierluigi Valenti, Rocca Madre Cooperative).


The ICARDA EP of bread wheat was cultivated for the first time in 
2010 by two farmers, one located in central Italy (Tuscany) and the other 
in southern Italy (Sicily). The two farmers have planted their respective EPs 
ever since (and still do at the time of this writing) using their own seed. 
In 2019, seed of the two EPs was purchased by farmers in central Italy 
on the Adriatic coast (Marche) and was planted side by side. As shown 
in figure 44, the EP that evolved in Tuscany (right) was later, taller, 
greener, and more susceptible to lodging, while the EP that evolved in 
Sicily (left) was earlier, shorter, paler green, and more lodging resistant. 

This prompted farmers to experiment with a mixture of the two EPs, 
and the first results seem to indicate that, as expected, the mixture, 
called the Piceno EP after the name of the region in ancient, 
pre-Roman times, is more lodging resistant than the EP 
from Tuscany. 
9.7. The agronomic management of evolutionary populations 


A question often asked by farmers is “how do we grow EPs and mix- 
tures?” and the obvious answer, as we mentioned earlier, is “like any 
other variety of that particular crop”. 

However, some of the typical characteristics of EPs and mixtures 
such as asynchronous flowering and maturity, while having all the ad- 
vantages described earlier, may pose problems at harvesting, particular- 
ly in cereal crops, because while waiting for the late types to mature, the 
early types may shatter the seed. While farmers with a long experience 
in growing EPs of cereals do not seem to perceive this as a problem, we 
can look at this issue from two angles. If harvesting is done by combine, 
as is often the case in most developed countries, and the farmer relies 
on contractors, harvesting is seldom timely and is often done later 
than the optimal time. This causes yield losses due to seed shattering of 
the early types. On the one hand, this represents selection against 
shattering, and therefore the yield losses should decrease with time. On the 
other hand, it also represents the loss of early maturing genotypes from 
the population. As shown in the experiments of Nevo et al. (2012) and 
Vigouroux et al. (2011), natural selection in areas affected by drought 
favours early heading genotypes. Therefore, in drought prone areas, 
which are likely to expand with climate change, it may be advisable 
to reduce the loss of the early maturing genotypes in the populations 
by following one of the methods of selection within EPs and mixtures 
described in section 9.2. 

In the case of horticultural crops, staggered maturity can become an 
advantage when farmers sell their products directly on their property 
or in local markets. In fact, in the case of uniform varieties, particular- 
ly of horticultural crops, farmers are obliged to either develop storage 
facilities or to sell through the large-scale retail organizations with a 
consequent loss of profit. 

An issue to consider in the case of EPs and mixtures of cereal crops 
such as wheat, barley, and rice is the optimum plant density (namely 
seed rate). The issue is well known in crops where the varieties are 
made of genetically identical plants that therefore have virtually the 
same fitness. In this case, yield increases with plant density until reach- 
ing a plateau. This is because at low seed rates (low plant density) there 
is no competition between plants, and yield is limited by the number of 
plants per unit area. As the number of plants per unit area increases, so 
does the competition between plants with a corresponding decrease of 
the yield per plant. With the increase in plant density, a point is reached 
where there is a balance between the increase in the number of plants/ 
unit area and the decrease in individual plant yield with no further in- 
crease in the overall yield. EPs and mixtures bring drastic changes in 
the dynamics of plant-to-plant interactions within the crop and therefore 
the issue of plant density should be of particular interest because it af- 
fects the balance between the competitive and facilitative interactions. 
However, the effect of plant density is a complex issue in the case of 
EPs and mixtures because of the ever-changing portfolio of genotypes 
they contain. 

In practical terms, and in the case of the wheat EPs grown in Italy, 
farmers are impressed by their high weed-suppressive ability, which in 
some cases has been improved by replacing drill planting with broad- 
cast planting, which can be done mechanically with a fertilizer spreader 
or with a modified drill. 


9.8. Seed production of evolutionary populations 


As mentioned earlier, a property of EPs and mixtures is their ability 
to evolve both in time and space, and therefore the farmers who culti- 
vate them have every interest in producing and using their own seeds. 

To explain the issues at stake, we should remember that in some 
crops, for example cereal crops, the kernel performs two functions and, 
depending on the function, it assumes two different names: when it per- 
forms the function of propagation, we call it seed, and when it performs 
the function of the starting point of a product (for example bread) we 
call it grain. 

Where harvesting is fully mechanized and is done by contractors 
who serve several farms, this might create problems of contamination 
by mixing seed of different crops. If the seeds are very different 
as in the case of legumes and cereals, they can be easily separated after 
harvesting, but if the same combine harvests a field of a bread wheat EP 
or mixture after harvesting a barley field, or a field of a durum wheat EP 
or mixture after harvesting a field of bread wheat, the separation between 
the kernels of the two crops becomes more difficult, if not impossible. 
The same problem may occur within the same crop in the case of EPs 
and mixtures differing in seed colour (e.g., black-seeded vs white-seeded 
barley, or kabuli vs desi chickpea), in spike morphology (two-row 
vs six-row barley), or in the appearance of the kernels after threshing 
(naked vs hulled barley). 

Commercial combines cannot be completely cleaned when they move 
from one crop to another. Therefore, if part of the harvested grain is 
used as seed for the following cropping season, it may contain 
some grains of other species, for example barley grains in a bread wheat 
population, or bread wheat grains in a durum wheat population. Because 
the three crops differ in competitive capacity (barley is more 
competitive than bread wheat, and bread wheat is more competitive than 
durum wheat), over time the population is no longer a bread wheat or 
a durum wheat EP, which creates problems in processing and marketing the 
products derived from it. 

To avoid these problems, the seed to be used for sowing CANNOT 
come from the same field used for grain production, that is, seed and 
grain must come from different fields, managed from sowing to har- 
vesting in different ways. 


This solution has three implications. 


The first implication is that the crops planted for seed production 
should be drilled using a distance between rows double the norm (figure 
45 left). This will allow an accurate inspection of the field after heading 
and before harvesting to rogue all the plants belonging to other species 
without doing any damage to the crop planted for seed production. One 
possible drawback is that the extra space between rows may increase 
the presence of weeds. One alternative is to leave an alley of about 1 m 
between strips of a commercial drill (figure 45 right). 

The second implication concerns the crop rotation: from our initial 
experience in Central Italy with wheat EPs, the fields chosen for seed 
production and sown after a 3-4-year alfalfa crop are weed free and 
therefore ideal for either of the planting solutions discussed above. This is 
important for the third implication. 
Figure 45. Two fields for the production of seed of an evolutionary ICARDA population of 
bread wheat: on the left, sown in alternate rows to allow the uprooting of any barley or durum 
wheat plant (courtesy of D. Perozzi); on the right, a field of an evolutionary ICARDA durum 
wheat population with an alley between each pass of a 3-meter drill. 


The third implication is that, if the separation between seed produc- 
tion and grain production were to be made by each individual farmer, 
part of the land should be dedicated to the production of seed to be 
used on farm. This might be difficult in the case of small farms and also 
involves two elements of risk. The first is linked to adverse climatic 
events that could leave the farmer without seeds or with an insufficient 
quantity of seed. The second is of a genetic nature and is associated 
with a possible continuous reduction of genetic diversity due to the pro- 
cess of adaptation to the specific microenvironment represented by a 
single farm. 
A consequence of this third implication, which we have begun discussing 
with farmers, is to consider seed production as a responsibility to 
be carried out on a rotational basis by a community of farmers in a 
relatively homogeneous area, as a service to the entire community of farmers 
who cultivate one or more EPs or mixtures in that area. The area could 
well correspond to a TE as described on page 96. This would not only allow 
the field dedicated to seed production of the EP of a crop to be placed 
in an appropriate rotation, but also reduce the risk of unwanted 
and uncontrolled mixing at the time of harvesting by contractors. 

All this could become easier to manage if the community became in- 
dependent by equipping itself with a combine dedicated exclusively to 
the harvesting of the fields planted for the production of seed of the EPs 
and mixtures. The addition of a small thresher would allow threshing 
the spikes selected from the EPs and mixtures. 

Each farmer who participates in the process can reproduce, in rota- 
tion with other farmers, the seed following the practices described. The 
quantities produced will be exchanged within the community and it is 
precisely this practice of exchange that contributes to supporting farm- 
ers’ knowledge and encouraging the spread of EPs and mixtures. 

Seed production of EPs and mixtures of a number of horticultural 
crops, such as tomatoes, zucchini, eggplants, etc., where the seed is part 
of the fruit, which represents the commercial product, does not present 
the problems described for cereals. However, the main issue in this case is 
sampling. When a single fruit, or few of them, can provide all the seed 
needed to establish the production field for the next cropping season, 
this creates the risk of not capturing the entire diversity available within 
the EP or the mixture. This is a serious problem because evolutionary 
theory suggests that the ability of populations to become locally adapt- 
ed depends on their genetic diversity. 

An extremely important aspect of the practice of exchanging seeds 
and, in general, of regaining control of seed production by the farmers, 
is the control of diseases, particularly the seed-borne diseases. The risk 
of spreading diseases together with the seed is always present and can 
compromise the credibility of the whole process particularly in organic 
agriculture where the options for seed treatment are limited. This risk, 
too, can be greatly reduced by working as a community thus fostering 
the exchange of knowledge on how to produce healthy seed and how to 
store both seed and grain under optimum conditions. 


10. 
EXPERIMENTAL DESIGNS 
AND STATISTICAL ANALYSIS 


10.1. Why experimental designs and statistical analysis 


In plant breeding, whether conventional, molecular, participatory or 
evolutionary, we are dealing with differences: during selection we need 
to make sure that what we select is different (better) from what we al- 
ready have; in studying farmers’ preferences we need to know, for ex- 
ample, whether those of men are different or not from those of women. 
Also, in agronomy we are dealing with differences between different 
treatments, such as fertilizer application, weed control etc., or treatment 
combinations. 

What causes the differences between preferences, populations, vari- 
eties or treatments as we observe them in the field? One cause could be 
that the two varieties, or the two EPs, or the two agronomic treatments 
are truly different. However, they also may look different because the 
soil on which one variety grows or one treatment has been applied is 
different (more fertile, deeper, better structured etc.) than the soil where 
the other variety grows or the other treatment has been applied. In this 
case the two varieties, or the two treatments are actually NOT different, 
even though they look different. More generally, two varieties or two 
treatments which ARE NOT different may look different for reasons 
that are unknown to us, or for reasons that are not controlled by the experiment. 
Similarly, two varieties or two treatments which ARE different, may 
look similar for the same reasons. 
Figure 46. The Normal Distribution. The extreme left and right are called tails. 


Experimental designs and statistical analysis help us in discovering (in 
probabilistic terms) what is the likely cause of an observed difference. 
This is possible only if the data follow a Normal Distribution (figure 46) 
because in this case we can use the properties of the Normal Distribution 
to calculate the probability that a given difference is only due to chance 
and therefore is not real (the assumption that there is no real difference 
is called the null hypothesis). 

The Normal Distribution shows that, when there are no real differences, 
as for example when tossing a coin (in which case we expect heads or tails 
with the same probability of 50%), the probability of observing, 
only by chance, a small difference (for example 40% heads and 60% tails) 
is much higher, as shown by the large central area of the distribution, than 
the probability of observing a large difference (for example 90% heads and 
only 10% tails), as shown by the ends of the distribution, which represent a 
much smaller area: one end corresponds to 90% heads and only 10% tails, 
and the opposite end to 10% heads and 90% tails. 
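
To put rough numbers on the coin example, the short R sketch below assumes a series of 10 tosses of a fair coin (the number of tosses is our assumption; it is not stated in the example). It uses the binomial distribution, which the Normal Distribution approximates when the number of tosses is large, to compare the probability of a small deviation from the expected 50% with that of a large one.

```r
# Assumption for illustration only: 10 tosses of a fair coin
n_tosses <- 10
p_heads  <- 0.5

# Probability of every possible number of heads when the coin is fair
prob <- dbinom(0:n_tosses, size = n_tosses, prob = p_heads)
names(prob) <- 0:n_tosses

p_4_heads <- dbinom(4, size = n_tosses, prob = p_heads)  # 40% heads, 60% tails
p_9_heads <- dbinom(9, size = n_tosses, prob = p_heads)  # 90% heads, 10% tails

print(round(prob, 4))
cat("P(4 heads out of 10):", round(p_4_heads, 4), "\n")
cat("P(9 heads out of 10):", round(p_9_heads, 4), "\n")
```

With these assumptions the small deviation is roughly twenty times more likely than the large one, which is what the central area and the tails of figure 46 express graphically.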

There are two main properties of a Normal Distribution. The first is 
the mean ($\mu$), which, in the case of figure 46, is zero because it is the 
mean of the differences between the number of tails and the number of 
heads. The second is the standard deviation ($\sigma$), which is the distance 
(deviation) from the mean (therefore a linear measure). The Normal 
Distribution tells us the exact percentage of the total area under the 
curve that lies between the mean and ± a given number of standard deviations. 
For example, the area within the mean ± 1 standard deviation is 68.27%. 
Similarly, the area within the mean ± 2 standard deviations is 95.45%: this 
means that, if there are no real differences, there is slightly less than 5% 
probability (100 − 95.45 = 4.55%) of finding a value higher than the mean 
+ 2 standard deviations or lower than the mean − 2 standard deviations. 
To be more accurate, in the absence of real differences, there is a 5% 
probability of finding a value higher than the mean + 1.96 standard 
deviations or lower than the mean − 1.96 standard deviations, and there is a 
1% probability of finding a value higher than the mean + 2.576 standard 
deviations or lower than the mean − 2.576 standard deviations. 
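
The areas and thresholds quoted above can be checked in R with the base functions `pnorm()` (cumulative probability of the Normal Distribution) and `qnorm()` (its inverse). The sketch below is only a check of those figures; the helper name `area_within` is ours.

```r
# Area under the standard Normal curve within k standard deviations of the mean
area_within <- function(k) pnorm(k) - pnorm(-k)

area_1sd <- area_within(1)   # about 0.6827
area_2sd <- area_within(2)   # about 0.9545

# Number of standard deviations leaving 5% and 1% in the two tails together
z_05 <- qnorm(1 - 0.05 / 2)  # about 1.96
z_01 <- qnorm(1 - 0.01 / 2)  # about 2.576

print(round(c(area_1sd, area_2sd), 4))
print(round(c(z_05, z_01), 3))
```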

In other words, even when comparing the same wheat variety on two 
different plots, we can still obtain experimental data such as plant height, 
spike length, kernel weight, and grain yield, which are different; the Nor- 
mal Distribution not only tells us that the bigger the difference the lower 
its probability, but allows us to determine exactly what that probability is. 

The values shown above are found in statistical tables such as the one 
shown on page 121, corresponding to the respective probability levels (P 
levels). This is exactly what a statistical analysis does, provided that we 
have chosen the appropriate experimental design and have done the 
correct randomization: it tells us the probability that 
a given difference is due to chance. When that probability is less than 
5% (P<0.05), or less than 1% (P<0.01), we conclude that the difference 
is significant or highly significant, respectively. But we should always 
remember that there is always a probability (less than 5% or less than 1%) that our 
conclusion is wrong, and that there are in fact no differences. If this is 
indeed the case, we would have made what is known as a Type 1 error, 
i.e., rejecting the null hypothesis when the null hypothesis is true (table 
4). On the other hand, if we do not reject the null hypothesis when the 
null hypothesis is false, we make what is known as a Type 2 error. 
Table 4. Type 1 and Type 2 errors in experimental statistics. 


Decision (based on a sample)    Null hypothesis is true    Null hypothesis is false 
Reject null hypothesis          Type 1 error               Correct decision 
Null hypothesis not rejected    Correct decision           Type 2 error 


Figure 46 and the associated probabilities of given deviations from 
the mean refer to an ideal (infinite) population. In practice we always 
experiment with a finite number of items (varieties, populations, 
mixtures, agronomic treatments, etc.) that we assume to be a sample 
randomly extracted from an infinite population normally distributed. 


10.2. Confidence interval 


In statistics, the confidence interval (CI) is a type of interval estimate, 
computed from the statistics of the observed data, which might contain 
the true value of an unknown population parameter within a given prob- 
ability. The interval has an associated confidence level that quantifies 
the level of confidence that the parameter is captured by the interval. 
More strictly speaking, the confidence level represents the frequency 
(i.e., the proportion) of possible confidence intervals that contain the 
true value of the unknown population parameter. In other words, if con- 
fidence intervals are constructed using a given confidence level from an 
infinite number of independent sample statistics, the proportion of those 
intervals that contain the true value of the parameter will be equal to the 
confidence level. 
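
The frequency interpretation just described can be illustrated with a small simulation. The R sketch below is only an illustration under assumptions of our own: a hypothetical Normal population with mean 50 and standard deviation 10, samples of 20 individuals, and 5,000 repeated samples instead of an infinite number. It counts how often the 95% confidence interval of the sample mean contains the true mean.

```r
set.seed(1)        # arbitrary seed, used only to make the example reproducible

true_mean <- 50    # hypothetical population mean
true_sd   <- 10    # hypothetical population standard deviation
n         <- 20    # individuals per sample
n_samples <- 5000  # number of repeated samples

# For each sample, compute the 95% CI of the mean and check whether
# it contains the true mean of the population
covers <- replicate(n_samples, {
  x  <- rnorm(n, mean = true_mean, sd = true_sd)
  ci <- t.test(x, conf.level = 0.95)$conf.int
  ci[1] <= true_mean && true_mean <= ci[2]
})

coverage <- mean(covers)  # proportion of intervals containing the true mean
print(coverage)
```

The proportion obtained should be close to 0.95, although any single interval either contains the true mean or does not.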

Confidence intervals consist of a range of potential values of the un- 
known population parameter. However, the interval computed from a 
particular sample does not necessarily include the true value of the pa- 
rameter. Based on the (usually taken) assumption that observed data 
are random samples from a true population, the confidence interval ob- 
tained from the data is also random. 

The confidence level is designated prior to examining the data. Most 
commonly, the 95% confidence level is used. However, other confi- 
dence levels can be used, for example, 90% and 99%. 

The calculation of the CI will be shown in one of the following sections. 
10.3. Not only p-values? 


There is currently a scientific debate, which some have called a “statistics 
war” (Mayo 2018), on how much importance one should give to the 
probability level as the sole arbiter of statistical significance and 
hence of whether to accept or reject the null hypothesis (see for example Amrhein 
et al. 2019; Wasserstein et al. 2019). One example of a more extreme 
position, shared by several statisticians, is the following list of what not 
to do (Wasserstein et al. 2019): 

— Do not base your conclusions only on whether an association or an 
effect is “statistically significant” (for example, P<0.05); 

— Do not believe that an association or an effect exists just because it 
was “statistically significant”; 

— Do not believe that an association or an effect is absent just because 
it was not “statistically significant”; 

— Do not believe that your p-value gives the probability that chance 
alone produced the observed association or effect, or the 
probability that your test hypothesis is true (see also below on this issue); 

— Do not conclude anything about scientific or practical importance 
based on statistical significance (or lack of it). 

In other words, the p-value does not tell us anything about the hy- 
pothesis, but it tells us something about our data: if we repeat the anal- 
ysis several times using new data each time, and if the null hypothesis 
were really true, at p = 0.05, on only 5% of those occasions we would 
reject the null hypothesis and be wrong (Dirnagl 2019). 
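
This behaviour of the p-value under a true null hypothesis can be verified by simulation. In the R sketch below, which rests on assumptions of our own (two hypothetical treatments drawn from the same Normal population with mean 50 and standard deviation 10, 11 values per treatment, 5,000 repeated experiments), there is no real difference between treatments, so every "significant" result is a Type 1 error.

```r
set.seed(2)            # arbitrary seed, used only to make the example reproducible

n_experiments <- 5000  # number of repeated experiments
n_per_group   <- 11    # values per treatment

# Both treatments come from the same population: the null hypothesis is true
p_values <- replicate(n_experiments, {
  a <- rnorm(n_per_group, mean = 50, sd = 10)
  b <- rnorm(n_per_group, mean = 50, sd = 10)
  t.test(a, b, var.equal = TRUE)$p.value
})

# Proportion of experiments declared significant at P < 0.05
false_positive_rate <- mean(p_values < 0.05)
print(false_positive_rate)
```

The proportion should be close to 0.05: about one experiment in twenty is declared significant even though no real difference exists.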

The American Statistical Association addressed this issue in a sym- 
posium held in October 2017 whose conclusions have been published 
in a special issue of The American Statistician. 

An interesting example of the danger of relying only on p-values 
is shown graphically in Figure 47 (Amrhein et al. 2019), which sum- 
marizes the results of two similar medical studies on the side effect 
(fibrillation) of an anti-inflammatory drug. One of the studies conclud- 
ed that the drug did not have significant side effects (red bar in figure 
47), although there was a mean of a 20% greater risk of side effects in 
treated patients than in untreated patients. The 95% confidence interval 
spanned from a 3% decreased risk to a considerable risk increase of 
48%. A previous study on the same side effect of the same drug had 
found the same average risk increase of 20%, which in that study was significant. 
That study was simply more precise, with a confidence interval 
spanning from 9% to 33% greater risk of fibrillation. 


[Figure 47 panel, “Beware of false conclusions”: studies currently dubbed ‘statistically 
significant’ and ‘statistically non-significant’ need not be contradictory, and such designations 
might cause genuine effects to be dismissed. The interval estimates of a ‘significant’ study (low 
P value) and of a ‘non-significant’ study (high P value) are drawn on an axis running from 
decreased effect, through no effect, to increased effect. The observed effect (or point estimate) 
is the same in both studies, so they are not in conflict, even if one is significant and the other 
is not.] 

Figure 47. An example of similar results with different conclusions concerning significance 
(redrawn from Amrhein et al. 2019). 


In the first study, it was absurd to conclude that the statistically 
non-significant results showed “no association” between the drug and 
fibrillation when the interval estimate included serious risk increases; 
it is equally absurd to claim that these results were in contrast with the 
previous study showing an identical observed effect. 

This is an example of what the authors should have written: “Like in a 
previous study, our results suggest a 20% increase in risk of fibrillation 
in patients receiving the anti-inflammatory drug. Nonetheless, a risk 
difference ranging from a 3% decrease, a small negative association, 
to a 48% increase, a substantial positive association, is also reasonably 
compatible with our data, given our assumptions”. 

The controversy about the usefulness of p-values has been enriched 
by interesting stories such as the following. “For a brief moment in 2010, 
Matt Motyl [a psychology PhD student at the University of Virginia, 
Charlottesville] was on the brink of scientific glory.... Data from a study 
of nearly 2,000 people seemed to show that political moderates saw 
shades of grey more accurately than did either left-wing or right-wing 
extremists....The P value....was 0.01 — usually interpreted as ‘very significant’. 
Publication in a high-impact journal seemed within Motyl’s 
grasp. But then reality intervened. Sensitive to controversies over repro- 
ducibility, Motyl and his adviser, Brian Nosek, decided to replicate the 
study. With extra data, the P value came out as 0.59 — not even close to 
the conventional level of significance, 0.05. The effect had disappeared, 
and with it, Motyl’s dreams of youthful fame” (Nuzzo 2014). 

As mentioned earlier, not everybody agrees: “I disagree that testing 
an association for statistical significance should be banned. We may 
just as well argue in favor of banning exams” (Adams 2019). Similarly, 
Zhang (2019) claims that “the concept of statistical significance is anal- 
ogous to ‘beyond reasonable doubt’ in the judicial system — it reflects 
the uncertainty in data that people are prepared. In my view, banning 
the use of statistical significance would be impractical. Instead, we need 
to educate scientists in the proper usage of the term.” 

Nature, one of the leading scientific journals, is not seeking to change 
how it considers statistical analysis in the evaluation of papers, but it 
encourages readers to exchange their views (Editorial 2019). 


10.4. Parameters and estimates: mean, standard deviation and 
degrees of freedom 


The properties of an ideal (infinite) population, namely the mean ($\mu$) 
and the standard deviation ($\sigma$), are indicated with Greek letters and are 
called parameters. As we cannot measure all the individuals of an infinite 
population, we cannot calculate $\mu$ and $\sigma$, but by measuring the 
individuals of a sample taken at random from the population, we can 
estimate them. The estimates are indicated with the Latin letters 
corresponding to the Greek letters for the parameters. 

The estimate of $\mu$ is the mean of the sample ($\bar{x}$) and is calculated as 
$\sum x / n$. For example, if we have 


15, 20, 17, 40 
the sum is $\sum x = 92$ and the mean is $\bar{x} = 92/4 = 23$. 


An important property of the mean is that the sum of the deviations 
of each value from the mean is zero. In fact: 
15 - 23 = -8 
20 - 23 = -3 
17 - 23 = -6 
40 - 23 = +17 


and (-8) + (-3) + (-6) + (+17) = 0 

This property is important because it implies that once we know the 
first $n-1$ deviations, the last one is automatically known; in other words, 
the last deviation from the mean does not add 
any information to that given by the first $n-1$ deviations, 
or, in statistical terms, it is not independent. 

The estimate of $\sigma$ is the standard deviation of the sample ($s$) and is 
calculated as the square root of $s^2$, the estimate of the variance $\sigma^2$ 
(also called mean square or sample variance), as follows: 

$$s^2 = \frac{\sum (x - \bar{x})^2}{n-1} = \frac{\sum x^2 - (\sum x)^2/n}{n-1} \quad \text{and} \quad s = \sqrt{s^2}$$

The term $\sum (x - \bar{x})^2$ is the sum of the squared deviations from the 
mean (we need to square them to get rid of the negative signs), which 
is abbreviated as SS, while the term $(\sum x)^2/n$ is called Correction Factor 
and abbreviated as CF. 

Using the same data as before: 

$$s^2 = \frac{(15-23)^2 + (20-23)^2 + (17-23)^2 + (40-23)^2}{n-1} = \frac{398}{3} = 132.67$$

and $s = \sqrt{132.67} = 11.52$ 
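
The same calculations can be followed step by step in R. The sketch below uses the four values of the example; the object names are ours. It also shows that the shortcut based on the Correction Factor gives the same sum of squares, and that the result agrees with the built-in functions `var()` and `sd()`.

```r
x <- c(15, 20, 17, 40)       # the four values of the example
n <- length(x)

x_mean     <- sum(x) / n     # sample mean: 23
deviations <- x - x_mean     # -8, -3, -6, +17 (their sum is zero)

ss       <- sum(deviations^2)  # sum of squared deviations (SS): 398
cf       <- sum(x)^2 / n       # Correction Factor (CF)
ss_short <- sum(x^2) - cf      # same SS obtained without computing deviations

s2 <- ss / (n - 1)           # sample variance, with n - 1 degrees of freedom
s  <- sqrt(s2)               # sample standard deviation

print(c(mean = x_mean, SS = ss, variance = s2, sd = s))
```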

The denominator ($n-1$) used in the calculation of the variance is called 
degrees of freedom (df), which, for this estimate, is always the number 
of data minus 1. The reason for using $n-1$ instead of $n$ depends on 
the property of the mean illustrated earlier, namely that the sum of the 
deviations of each value from the mean is zero. Therefore, using again 
the same example: 


15 - 23 = -8 
20 - 23 = -3 
17 - 23 = -6 
40 - 23 = ? 


Once we know $n-1$ (in our case 3) deviations, the last one MUST 
be +17 for the sum of the deviations to be zero. Therefore, since the 
calculation of the standard deviation is based, as the term indicates, on 
deviations, we must use $n-1$. 
The concept of degrees of freedom is very important in designing an 
experiment as we will see below, but also in deciding how much confi- 
dence we can have in p-values and significance levels. 

For $\bar{x}$ and $s$ measured in a sample to be reliable estimates of the 
parameters $\mu$ and $\sigma$ of the population, the sample must be representative of 
the population, and to be representative, the sample must be a random 
sample. A random sample is one in which any individual of the refer- 
ence population has the same probability of being included. 


10.5. A simple experiment (the t test) 


We will use a simple experiment to clarify the properties of degrees 
of freedom, which will be important in designing experiments. 

The obvious question in this experiment, in which the height of 11 varieties 
was measured with and without fertilizer (table 5), is whether, 
on average, height increased with fertilizer. In other words, is the 
mean difference of 41 significantly different from 0? Or, in terms of 
distributions, are the two sets of 11 values (second and third column) 
from the same distribution or from two different distributions with two 
different $\mu$, of which 97 and 56 are the respective estimates? 


Table 5. Height of 11 varieties with and without fertilizer. 


Variety   With fertilizer ($X_1$)   Without fertilizer ($X_2$)   Difference ($X_1 - X_2$) 
1            57                        89                          -32 
2           120                        30                           90 
3           101                        82                           19 
4           137                        50                           87 
5           119                        39                           80 
6           117                        22                           95 
7           104                        57                           47 
8            73                        32                           41 
9            53                        96                          -43 
10           68                        31                           37 
11          118                        88                           30 
Total     1,067                       616                          451 
Mean         97                        56                           41 
The t test can answer this question in terms, as we said earlier, of 
probabilities. These are the calculations. 

1. Calculate the sum of the squares of the values in each series: 

$\sum X_1^2 = 57^2 + 120^2 + \dots + 118^2 = 111{,}971$ 

$\sum X_2^2 = 89^2 + 30^2 + \dots + 88^2 = 42{,}244$ 


2. Calculate the two correction factors: 
$(\sum X_1)^2/11 = 1{,}067^2/11 = 103{,}499$ 
$(\sum X_2)^2/11 = 616^2/11 = 34{,}496$ 


3. The sums of the squared deviations from the mean will be: 
$\sum (X_1 - \bar{X}_1)^2 = 111{,}971 - 103{,}499 = 8{,}472$ 

$\sum (X_2 - \bar{X}_2)^2 = 42{,}244 - 34{,}496 = 7{,}748$ 

each with 11 - 1 = 10 degrees of freedom 


4. The pooled variance $s^2$ will be (8,472 + 7,748)/(10 + 10) = 811 with 
20 df 


5. The standard deviation of the difference between the two means will be: 

$s_{(\bar{X}_1 - \bar{X}_2)} = \sqrt{2 s^2/n} = \sqrt{(2 \times 811)/11} = 12.14$ 

The t test consists in comparing the observed difference of 41 with 
its standard deviation 12.14 and it is the simple ratio between the two: 

$t = (\bar{X}_1 - \bar{X}_2)/s_{(\bar{X}_1 - \bar{X}_2)} = 41/12.14 = 3.38$ with 20 df 
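
For readers who wish to follow the five steps above one by one, they can be sketched in R as follows, using the data of table 5. The object names are ours and the code only reproduces the hand calculation; it is not a replacement for a dedicated t test function.

```r
x1 <- c(57, 120, 101, 137, 119, 117, 104, 73, 53, 68, 118)  # with fertilizer
x2 <- c(89, 30, 82, 50, 39, 22, 57, 32, 96, 31, 88)         # without fertilizer
n  <- length(x1)                                            # 11 varieties

# Step 1: sums of the squares of the values
sum_sq1 <- sum(x1^2)
sum_sq2 <- sum(x2^2)

# Step 2: correction factors
cf1 <- sum(x1)^2 / n
cf2 <- sum(x2)^2 / n

# Step 3: sums of squared deviations from the mean, each with n - 1 df
ss1 <- sum_sq1 - cf1
ss2 <- sum_sq2 - cf2

# Step 4: pooled variance with (n - 1) + (n - 1) = 20 df
df        <- (n - 1) + (n - 1)
s2_pooled <- (ss1 + ss2) / df

# Step 5: standard deviation of the difference between the two means
se_diff <- sqrt(2 * s2_pooled / n)

# t is the ratio between the observed difference and its standard deviation
t_value <- (mean(x1) - mean(x2)) / se_diff

print(c(SS1 = ss1, SS2 = ss2, pooled_variance = s2_pooled,
        se_diff = se_diff, t = t_value))
```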

All these calculations are easily done with a computer or even within 
Excel by installing the “data analysis” add-on tool. 

The critical issue is what to do with this value of 3.38. 
Before doing that, we provide the script to perform the calculations in R using the data of 
table 5 (the text after each # is our comment): 


a <- c(57, 120, 101, 137, 119, 117, 104, 73, 53, 68, 118)  # data of the first treatment (with fertilizer), stored in the object a 
b <- c(89, 30, 82, 50, 39, 22, 57, 32, 96, 31, 88)         # data of the second treatment (without fertilizer), stored in the object b 

boxplot(a, b)                   # not essential, but it gives a graphical representation of the difference between the two means 
t.test(a, b, var.equal = TRUE)  # t test assuming the same variance in the two treatments 


Results 

data: a and b 
t = 3.3764, df = 20, p-value = 0.003 
alternative hypothesis: true difference in means is not equal to 0 
95 percent confidence interval: 
15.66997 66.33003 
sample estimates: 
mean of x mean of y 
97 56 

To determine the probability that such a difference has been obtained 
only by chance and that there is actually no real difference due to fer- 
tilizer, we have to consult statistical tables, available at the end of all 
statistical books and also online (table 6). 

It is also possible to obtain statistical tables in R.

For example, the function:

pt(3.38, 20)

will give the cumulative probability of a t value lower than or equal to 3.38 with 20 degrees of freedom, so that 1 - pt(3.38, 20) is the probability of a larger value (one tail).

For a two-tailed test the function becomes:

2*(1-pt(3.38,20))

The three most used columns in table 6 are those indicated with a red arrow. They give the values of t that, for a given number of degrees of freedom, are exceeded by chance with a probability of 2.5%, 0.5% and 0.05% if we consider only one tail, and of 5.0%, 1.0% and 0.1% if we consider both tails, namely a positive or negative value of t.

A t value of 3.38 with 20 degrees of freedom has a probability of occurring only by chance of less than 1% in the case of a two-tailed test, or less than 0.5% in the case of a one-tailed test (the calculation with R gives us the exact two-tailed probability, which is 0.003). In fact, the critical value for 20 degrees of freedom at these probability levels is 2.845, and it increases to 3.552 at the next probability level.
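As a small illustration of how R can replace the printed table, the sketch below uses pt() for the probabilities and qt() for the critical values. The object names are ours; the only inputs are the t value of 3.38 and the 20 degrees of freedom used in the text.

```r
t_obs <- 3.38    # observed t
df_t <- 20       # degrees of freedom

# pt() is cumulative, so the upper tail is its complement
p_one_tail <- 1 - pt(t_obs, df_t)
p_two_tail <- 2 * p_one_tail

# Critical values of t at two-tailed P = 0.05 and P = 0.01
t_crit <- qt(c(0.975, 0.995), df_t)

print(signif(c(one_tail = p_one_tail, two_tail = p_two_tail), 3))
print(round(t_crit, 3))
```

Note that qt() takes the cumulative probability, so a two-tailed level of 5% corresponds to 0.975 and not to 0.95.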


10.6. Calculation of the confidence interval 


The formula for the confidence interval is

$(\bar{x}_1 - \bar{x}_2) \pm t \times s_{(\bar{x}_1 - \bar{x}_2)}$

Where:

- $(\bar{x}_1 - \bar{x}_2)$ is the observed difference between the means: in our case 41
- t is the table t value for the level of confidence we choose and for the number of degrees of freedom: in our case with 20 degrees of freedom we have 2.086 for P=0.05 (95% confidence interval) and 2.845 for P=0.01 (99% confidence interval)
- $s_{(\bar{x}_1 - \bar{x}_2)}$ is the standard deviation of the difference: in our case 12.14

41 + 2.086 * 12.14 = 66.324

41 - 2.086 * 12.14 = 15.676

These are the limits of the 95% confidence interval.

41 + 2.845 * 12.14 = 75.54

41 - 2.845 * 12.14 = 6.46

These are the limits of the 99% confidence interval.
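The same limits can be obtained in R without consulting the table. The sketch below is only indicative: conf_limits() is a hypothetical helper written for this example, and the inputs are the difference (41), its standard deviation (12.14) and the 20 degrees of freedom given above.

```r
diff_means <- 41     # observed difference between the two means
sd_diff <- 12.14     # standard deviation of the difference
df_err <- 20         # degrees of freedom of the pooled variance

# Hypothetical helper: limits of the interval for a given confidence level
conf_limits <- function(conf) {
  t_tab <- qt(1 - (1 - conf) / 2, df_err)   # two-tailed table value of t
  diff_means + c(-1, 1) * t_tab * sd_diff
}

ci95 <- conf_limits(0.95)
ci99 <- conf_limits(0.99)
print(round(rbind("95%" = ci95, "99%" = ci99), 2))
```

Small differences in the last decimals with respect to the hand calculation are due to the rounding of the table values of t.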

Therefore, regardless of whether we use a one-tailed or two-tailed test, there is a very small probability that such a t value can be obtained only by chance. In this case we conclude that the difference is statistically significant, keeping in mind that there is always a probability, although small, that such a conclusion is wrong.

In particular, in the case of a two-tailed test, we say that a difference, as mentioned earlier, is significant when the probability that it occurs by chance is less than 5% and highly significant when the probability that it occurs by chance is less than 1% or 0.1%. It is always very important to indicate the level of probability.

Table 6. The table of t.


There are two interesting aspects of table 6. The first is that the last row (corresponding to z) shows the values of t (the same applies to other statistical tests) corresponding to the reference populations, as discussed at pg. 111.

The second interesting aspect of table 6 is that the critical value of t initially decreases rapidly with the increase of the degrees of freedom (df), regardless of the probability. For example, at p = 0.05 it goes from 12.71 for 1 df, to 2.571 for 5 df, whereas after 20 degrees of freedom, the decrease is much slower. For example, at p = 0.05 it goes from 2.086 for 20 df, to 2.060 for 25 df. Above 30 df, the decrease for every additional df is even slower. This means that with a small number of degrees of freedom, such as in an experiment with few varieties or few treatments, it is difficult to detect small differences. It also means that large experiments generating more than 30 df for the experimental error (see below) are an inefficient use of resources. This is important to avoid designing unnecessarily large experiments.
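The slowing decrease of the critical value can be seen directly in R. This is a short sketch; the set of degrees of freedom is an arbitrary choice of ours, and 0.975 corresponds to a two-tailed probability of 0.05.

```r
# Critical t at two-tailed P = 0.05 for increasing degrees of freedom
dfs <- c(1, 5, 10, 20, 25, 30, 60, 120)
t_crit <- qt(0.975, dfs)

print(data.frame(df = dfs, t_crit = round(t_crit, 3)))

# Decrease of the critical value from one row to the next
print(round(diff(t_crit), 3))
```

The differences between successive rows shrink quickly, which is the numerical basis for the remark that little is gained beyond about 30 df.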


10.7. Correlation and regression 


In some of the experiments described in section 7 we have seen that 
the data analysis used two methods, correlation and regression, suited 
to study the relationships between variables, two in the case of correla- 
tion, two or more in the case of regression. 

The difference between the two methods is that in the case of corre- 
lation there are no reasons to think that one variable, for example grain 
yield measured at a research station, depends on another variable, for 
example grain yield measured in a farmer’s field, or vice versa. Simi- 
larly, using the example of the biplots that we have seen in some exper- 
iments, there are no reasons to think that the position of one location, 
or one location-year combination, depends on the position of another 
location-year combination. 

In these cases, the correlation is the suitable method to study the re- 
lationships between two such variables. 

However, there are cases in which one variable clearly depends on 
another variable: for example, the weight of a person depends on the 
age or, to refer to some examples used in this manual, stability of crop 
production is affected by climatic variables and by crop diversity (fig- 
ure 1), or grain yield depends on cycles of selection (figure 10). 

These cases are defined in mathematics as functions (a variable called dependent is a function of another variable called independent). In statistics these cases are analysed using regression.

The calculation of both the correlation coefficient and the regression coefficient is straightforward and can be done with the “Data Analysis Tool” available in Excel or with more sophisticated software.


10.7.1. The correlation coefficient


Suppose we have two variables, $X_1$ and $X_2$, as in table 5, where $X_1$ is the height with fertilizer and $X_2$ is the height without fertilizer of 11 wheat varieties. These values are repeated in the first two columns of table 7.

The first step is to calculate the sums of squares of the deviations from the mean, which we have already done:

$\sum x_1^2 = 8{,}472$ and $\sum x_2^2 = 7{,}748$

We now need to calculate a new term, which is the sum of the products of the deviations from the respective means, shown in the last two columns of table 7, namely:

$\sum x_1 x_2 = (-40 \times 33) + (23 \times -26) + \dots + (21 \times 32) = -2{,}888$

The correlation coefficient (r) is simply

$r = \sum x_1 x_2 / \sqrt{(\sum x_1^2)(\sum x_2^2)} = -2{,}888/\sqrt{8{,}472 \times 7{,}748} = -0.35646$


Table 7. The data of table 5 with the deviations from the means needed for the calculation of the correlation coefficient between $X_1$ and $X_2$.


Variety | With fertilizer ($X_1$) | Without fertilizer ($X_2$) | Deviations from the mean of $X_1$ | Deviations from the mean of $X_2$
1 | 57 | 89 | -40 | 33
2 | 120 | 30 | 23 | -26
3 | 101 | 82 | 4 | 26
4 | 137 | 50 | 40 | -6
5 | 119 | 39 | 22 | -17
6 | 117 | 22 | 20 | -34
7 | 104 | 57 | 7 | 1
8 | 73 | 32 | -24 | -24
9 | 53 | 96 | -44 | 40
10 | 68 | 31 | -29 | -25
11 | 118 | 88 | 21 | 32
Total | 1,067 | 616 | 0 | 0
Mean | 97 | 56 | 0 | 0


The correlation coefficient is a pure number with no dimensions. This means that $X_1$ and $X_2$ can be coded before the calculation, and the result will not change. Furthermore, r always lies between -1 and +1. A positive value indicates that $X_1$ and $X_2$ tend to increase together; a negative value indicates that when $X_1$ increases, $X_2$ decreases or vice versa; a value of 0 indicates that each of the two variables changes independently from the other.

To test whether r is significantly different from zero (the null hypothesis being that it is not), we need, as already done in the case of the t test, to compare the observed value with the value expected by chance based on the sample size we used. These values are tabulated and are available in all statistical books as well as online at:

https://www.real-statistics.com/statistics-tables/pearsons-correlation-table/

In the case of r, the number of degrees of freedom to use in consulting the table is (n-2) and not (n-1). In fact, as an extension of what we discussed earlier and using the data of table 7 as an example, we now have 11 pairs of deviations, but each of the last two deviations does not add any additional information to the previous ones. Therefore, in the case of table 7 we have (n-2) = 9 degrees of freedom. For this number of degrees of freedom, the critical levels of r are 0.602069 for P = 0.05 and 0.734786 for P = 0.01; therefore, our correlation coefficient is not significantly different from 0, and there is no evidence that the plant height of the varieties after receiving the fertilizer is related to the plant height of the same varieties without fertilizer.
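The calculation of r and of its critical levels can be sketched in R as follows. The object names are ours. The critical values of r are derived here from the critical values of t through the relation $r = t/\sqrt{t^2 + df}$, which is a standard identity and not a step described in this manual.

```r
x1 <- c(57, 120, 101, 137, 119, 117, 104, 73, 53, 68, 118)  # with fertilizer
x2 <- c(89, 30, 82, 50, 39, 22, 57, 32, 96, 31, 88)         # without fertilizer

# Deviations from the respective means (last two columns of table 7)
d1 <- x1 - mean(x1)
d2 <- x2 - mean(x2)

sum_prod <- sum(d1 * d2)                      # sum of products of deviations
r <- sum_prod / sqrt(sum(d1^2) * sum(d2^2))   # same value as cor(x1, x2)

# Critical r for n - 2 degrees of freedom at two-tailed P = 0.05 and 0.01
df_r <- length(x1) - 2
t_crit <- qt(c(0.975, 0.995), df_r)
r_crit <- t_crit / sqrt(t_crit^2 + df_r)

print(round(c(r = r, r_crit_05 = r_crit[1], r_crit_01 = r_crit[2]), 4))
```

Since the absolute value of r is below both critical levels, the sketch leads to the same conclusion as the table lookup.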

Note that the sign of r is ignored when making the test and, as already mentioned in the case of the t-test, the critical values of r to reject or not to reject the null hypothesis decrease considerably as (n-2) increases.


10.7.2. The regression coefficient 


As introduced earlier, in the simple regression analysis we still deal with two variables (the method can actually be extended to more than two variables). The difference with the correlation analysis lies in the fact that in this case one variable depends on the other. The independent variable (indicated with X) is usually a variable that takes values fixed by the scientist, for example the amount of fertilizer, a time series (hours, days, years), elevations, cycles of selection, etc. It could also be a variable that can be measured, but not chosen, as for example temperature, plants’ age, weight of an individual, etc. The dependent variable (indicated with Y) is a variable that, as mentioned earlier, is affected by the changes in the independent variable (for example yield is affected by fertilizer, growth is affected by age, etc.).

The distinction between dependent and independent variables is important because it implies that while X is a fixed quantity, not affected by errors, Y is a typical variable taken at random from a normally distributed population at each of the X values.

To illustrate the regression analysis, we will use the data of table 8, where the first two rows show the age and the height of bean plants measured at weekly intervals. The other rows will be discussed as the calculation proceeds.


Table 8. Height (cm) of bean plants taken at weekly intervals.


Plant age (X) | 1 | 2 | 3 | 4 | 5 | 6 | 7
Plant height (Y) | 5 | 13 | 16 | 23 | 33 | 38 | 40
Deviations from $\bar{X}$ = 4 | -3 | -2 | -1 | 0 | 1 | 2 | 3
Deviations from $\bar{Y}$ = 24 | -19 | -11 | -8 | -1 | 9 | 14 | 16
Expected height ($\hat{Y}$) | 5.571 | 11.714 | 17.857 | 24.000 | 30.143 | 36.286 | 42.429


The first step with data of this type is to build a graph like the one in 
figure 48, using the values of the independent variable (X) in the abscis- 
sa and the values of the dependent variable (Y) in the ordinate. 


Figure 48. Graphical relationship between plant age (X) and plant height (Y) in beans


It is evident that the height tends to increase linearly with the age of the plants and therefore a straight line seems to be the simplest way to represent the relationship between these two variables. Keeping in mind that each observed value of Y is a sample randomly taken from a normally distributed population, we assume that the population of Ys corresponding to each value of X has a mean $\mu$, which lies on the straight line, and a standard deviation $\sigma$, which is the same for all the populations.


The numerical value of the single $\mu$ corresponding to each X can be computed from the formula:

$\mu = \alpha + \beta (X - \bar{X})$

that, by putting $(X - \bar{X}) = x$, becomes

$\mu = \alpha + \beta x$

where $\mu$ is the expected value of Y (indicated as $\hat{Y}$) at a given value of x and $\beta$ is the slope of the straight line.

Note that when $(X - \bar{X}) = 0$, the term $\beta (X - \bar{X})$ becomes 0, and $\mu$ becomes equal to $\alpha$. Therefore, $\alpha$ is the mean of the population of Y when $(X - \bar{X}) = 0$, or when X is equal to $\bar{X}$.

As mentioned earlier when discussing parameters and estimates, we cannot know the values of $\alpha$ and $\beta$, but we can calculate their estimates, which will be affected by a certain error, as follows.


The straight line drawn in figure 48 is the sample regression of Y on X. Its position is fixed by two conditions:
1. It passes through the point determined by the mean of the two samples, namely $\bar{X} = 4$ and $\bar{Y} = 24$;
2. Its slope, b, is the sample regression coefficient and represents the rate at which Y changes per unit of X, and is calculated as $\sum xy / \sum x^2$.

The quantity $\sum xy$ is the sum of the products of the deviations of the X values from the mean of X, times the deviations of the Y values from the mean of Y. These deviations are reported in the third and fourth row of table 8. The quantity $\sum x^2$ is the sum of the squares of the deviations of X from the mean of X, already seen earlier, and therefore:

$\sum xy = (-3 \times -19) + (-2 \times -11) + \dots + (3 \times 16) = 172$

$\sum x^2 = (-3)^2 + (-2)^2 + \dots + (3)^2 = 28$

Therefore, b = 172/28 = 6.143, meaning that the height of the bean plants increases by 6.143 cm per week and the sample regression equation becomes:

$\hat{Y} = \bar{Y} + bx$ which, by using the estimates, can be written as $\hat{Y} = 24 + 6.143x$ and can be used to calculate the predicted plant height of the bean plants, reported in the last row of table 8.
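The slope and the expected heights can be reproduced in R, first by hand and then with lm(). This is a minimal sketch using the data of table 8; the object names are ours.

```r
age <- 1:7                                  # plant age (X)
height <- c(5, 13, 16, 23, 33, 38, 40)      # plant height in cm (Y)

# Deviations from the means (third and fourth row of table 8)
x <- age - mean(age)
y <- height - mean(height)

# Sample regression coefficient: sum of products over sum of squares of x
b <- sum(x * y) / sum(x^2)

# Expected heights from the sample regression equation
expected <- mean(height) + b * x

# The same line fitted with lm(), expressed in terms of X rather than x
fit <- lm(height ~ age)

print(round(b, 3))
print(round(expected, 3))
print(coef(fit))
```

lm() reports the intercept at X = 0 rather than the mean of Y, but the slope and the fitted values are those of the hand calculation.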

The last step of the calculation is to determine the significance of the 
estimates of the parameters of the regression line. 

This can be done through the analysis of variance (ANOVA) of the dependent variable Y.

This variance is, as usual, the sum of squares of the deviations from the mean of Y divided by the degrees of freedom. This is the total variance of Y.

The sum of squares, using the data of table 8, is:

$\sum (Y - \bar{Y})^2 = (-19)^2 + (-11)^2 + \dots + 16^2 = 1{,}080$

Before we proceed to calculate the variance, if we look in detail at each deviation we find that it is made of two components. For example, the first Y, namely 5, has a deviation from its mean of -19; this deviation is made of two components:

The first is the deviation of Y from its expected value $\hat{Y}$ of 5.571:

$Y - \hat{Y} = 5 - 5.571 = -0.571$

The second is the deviation of the expected value of Y from the mean of Y:

$\hat{Y} - \bar{Y} = 5.571 - 24 = -18.429$

And we can verify that in fact -0.571 + (-18.429) = -19.

If we can split a deviation into its components, we can do the same with the sum of squares associated with that deviation. Table 9 shows the two components of each deviation to facilitate the calculation of the respective sum of squares.

Table 9. Observed values (Y), expected values ($\hat{Y}$), deviations of observed values from the mean ($Y - \bar{Y}$), deviations of observed values from the expected values ($Y - \hat{Y}$), and deviations of expected values from the mean ($\hat{Y} - \bar{Y}$), obtained from the data of table 8.


X | Y | $\hat{Y}$ | $Y - \bar{Y}$ | $Y - \hat{Y}$ | $\hat{Y} - \bar{Y}$
1 | 5 | 5.571 | -19 | -0.571 | -18.429
2 | 13 | 11.714 | -11 | 1.286 | -12.286
3 | 16 | 17.857 | -8 | -1.857 | -6.143
4 | 23 | 24.000 | -1 | -1.000 | 0.000
5 | 33 | 30.143 | 9 | 2.857 | 6.143
6 | 38 | 36.286 | 14 | 1.714 | 12.286
7 | 40 | 42.429 | 16 | -2.429 | 18.429
Total | | | 0.000 | 0.000 | 0.000


We can now compute two sums of squares, namely:

$\sum (Y - \hat{Y})^2 = (-0.571)^2 + (1.286)^2 + \dots + (-2.429)^2 = 23.4286$

$\sum (\hat{Y} - \bar{Y})^2 = (-18.429)^2 + (-12.286)^2 + \dots + (18.429)^2 = 1{,}056.5714$

and, as expected: 23.4286 + 1,056.5714 = 1,080.0000

These two sums of squares represent one part due to the regression [$\sum (\hat{Y} - \bar{Y})^2$ = 1,056.5714] and one part due to the deviations from the regression [$\sum (Y - \hat{Y})^2$ = 23.4286]: this second part is due to the fact that the points corresponding to the observed values of Y do not lie precisely on the regression line.

It is obvious that the larger this second component, the lower the goodness of fit of the straight line to the observed values. The results of these calculations can be summarized as shown in table 10.


Table 10. ANOVA of the Y values of table 8.


Sources of variation | Sums of squares | Degrees of freedom | Mean squares | F | P
Regression | 1,056.5714 | 1 | 1,056.5714 | 225.5 | <0.01
Deviation from regression | 23.4286 | 5 | 4.6857 | |
Total | 1,080.0000 | 6 | | |


Following the same logic that we introduced in discussing the t-test, we can test the null hypothesis, which in this case is that the two estimates of variance, namely 1,056.5714 and 4.6857, are independent random samples from the same normal populations with a mean $\mu$ estimated by the grand mean (24.0) and a variance $\sigma^2$.

In this case, the test to be used is the F-test, which is calculated as F = mean square 1/mean square 2, where mean square 1 is the larger mean square.

The table of F is much more complex than the t table because it has three dimensions: the degrees of freedom of the first mean square, the degrees of freedom of the second mean square, and the probabilities. A simplified table for 5% probability is shown in table 15.

The table is used as follows: we look at the column with 1 degree of freedom and at the row with 5 degrees of freedom and we find a value of 6.608, much smaller than 225.5. Therefore, we can reject the null hypothesis because it has a probability lower than 5%, and we can declare the regression as significant at p < 0.05.
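The partition of the sum of squares and the F-test can be sketched in R as follows. The object names are ours; qf() and pf() stand in for the printed F table.

```r
age <- 1:7
height <- c(5, 13, 16, 23, 33, 38, 40)
fit <- lm(height ~ age)

# Partition of the total sum of squares
ss_total <- sum((height - mean(height))^2)
ss_dev <- sum((height - fitted(fit))^2)         # deviations from regression
ss_reg <- sum((fitted(fit) - mean(height))^2)   # due to regression

# F = larger mean square / smaller mean square, with 1 and 5 df
f_obs <- (ss_reg / 1) / (ss_dev / 5)
f_crit <- qf(0.95, 1, 5)                        # table value at 5%
p_val <- pf(f_obs, 1, 5, lower.tail = FALSE)    # exact probability

print(round(c(ss_total = ss_total, ss_reg = ss_reg, ss_dev = ss_dev), 4))
print(round(c(F = f_obs, F_crit_05 = f_crit), 3))
print(anova(fit))                               # the same table built by R
```

The call to anova() is included only as a cross-check of the hand calculation.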


The mean square of the deviation from regression ($s^2 = 4.6857$) is now used to calculate the standard errors, hence the significance, of a and b as follows:

The standard error of b is obtained as follows:

$s_b = \sqrt{s^2 / \sum (X - \bar{X})^2}$

and given that b = 6.143, $s^2 = 4.6857$ and $\sum (X - \bar{X})^2 = 28$

$s_b = \sqrt{4.6857/28} = \sqrt{0.1673} = 0.4091$

and therefore b = 6.143 ± 0.4091

The t-test will be, as described earlier, t = 6.143/0.4091 = 15.02 which, with 5 degrees of freedom (the same as for the deviations from regression) is highly significant. In fact, the t values for 5 degrees of freedom are 2.57 (P=0.05) and 4.03 (P=0.01) (table 6).
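A short R sketch of the standard error of b and of its t-test is given below. The object names are ours, and the last line simply squares t so that it can be compared with the F value of the ANOVA.

```r
age <- 1:7
height <- c(5, 13, 16, 23, 33, 38, 40)
fit <- lm(height ~ age)

# Mean square of the deviations from regression (5 degrees of freedom)
s2 <- sum(residuals(fit)^2) / df.residual(fit)

# Standard error of b and the corresponding t
s_b <- sqrt(s2 / sum((age - mean(age))^2))
t_b <- unname(coef(fit)["age"]) / s_b

print(round(c(s2 = s2, s_b = s_b, t_b = t_b, t_squared = t_b^2), 4))
```

The same standard error appears in the coefficient table printed by summary(fit).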


It should be noted that the value of t = 15.02 is the square root of the F value (225.5) in the analysis of variance of the regression.

Remembering that $a = \bar{Y}$, the standard error of a is obtained as follows:

$s_a^2 = s_{\bar{y}}^2 = (1/n) s^2$

$s_a = \sqrt{(1/n) s^2} = s/\sqrt{n}$

using the values of table 10:

$s^2 = 4.6857$; $s_a^2 = 1/7 \times 4.6857 = 0.6694$; $s_a = \sqrt{0.6694} = 0.8182$


This is useful to test, using the t-test, whether the regression line goes through specific points. For example, we may want to know if the regression line that we found earlier, which in terms of X (since x = X - 4) can be written as $\hat{Y} = -0.571 + 6.143 X$, goes through the origin of the axes. To answer this question, we first calculate the value of $\hat{Y}$ when X = 0:

$\hat{Y} = -0.571 + 6.143 (0) = -0.571$

Then we need to calculate the level of significance of the difference between -0.571 and 0, using, as we have done earlier, the standard error of the difference. We first obtain the variance of the difference:

$s_{\hat{Y}}^2 = s_{\bar{y}}^2 + s_b^2 (X - \bar{X})^2$

from which the standard error is $s_{\hat{Y}} = \sqrt{s^2/n + s_b^2 (X - \bar{X})^2}$.

Using the values already calculated we have:

$s^2 = 4.6857$; $s_b^2 = (0.4091)^2 = 0.1673$; X = 0; $\bar{X} = 4$; n = 7

$s_{\hat{Y}}^2 = 4.6857/7 + 0.1673 (0 - 4)^2 = 3.3462$

Hence $t = (-0.571 - 0)/\sqrt{3.3462} = -0.571/1.8293 = -0.312$ with 5 degrees of freedom.

Such a t value has a probability of occurring by chance of between 70 and 80% and therefore we cannot reject the hypothesis that the straight regression line just calculated goes through the origin.
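The test on the origin can be sketched in R with the same quantities. The object names are ours; x0 is the value of X at which the line is tested, and predict() is used only to obtain the expected height at that point.

```r
age <- 1:7
height <- c(5, 13, 16, 23, 33, 38, 40)
n <- length(age)
fit <- lm(height ~ age)

s2 <- sum(residuals(fit)^2) / (n - 2)       # mean square of the deviations
s2_b <- s2 / sum((age - mean(age))^2)       # variance of b

x0 <- 0                                     # point to be tested (the origin)
y0 <- unname(predict(fit, data.frame(age = x0)))   # expected Y at X = 0

# Standard error of the expected value at x0, then t and its probability
se_y0 <- sqrt(s2 / n + s2_b * (x0 - mean(age))^2)
t_0 <- (y0 - 0) / se_y0
p_0 <- 2 * pt(-abs(t_0), n - 2)             # two-tailed probability

print(round(c(y0 = y0, se_y0 = se_y0, t_0 = t_0, p_0 = p_0), 4))
```

Changing x0 and the value subtracted from y0 allows testing whether the line goes through any other point of interest.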


10.8. The experimental error 


The experimental error is a measure of the variation in an experiment that cannot be explained by the factors that we control in the experiment. In the case of agricultural experiments, this variation is typically due to differences in soil structure, depth, fertility patches, presence of weeds, uneven soil tillage, etc. To measure the variation due to the experimental error, we need to evaluate at least some of the treatments, for example varieties, more than once by replicating them (see below). If one genetically uniform variety is planted in different pots or in different spots in the same field and we measure, for example, its height, grain size, yield, etc., those measures should be identical. If they are not, this must be due to differences between the pots, or between the spots where they were planted. The variance associated with these uncontrolled differences is the error variance; the smaller the error variance and the higher the number of degrees of freedom (keeping in mind what we just said above commenting on table 6), the higher the chance that we will detect even small differences.


10.9. Replication 


As already mentioned, replication — repeating, for example, varieties 
more than once within the experiment — is a prerequisite to a correct (un- 
biased) estimate of the experimental error and therefore to the ability to 
calculate the probability that the observed differences are due to chance. 

The concept of replication was introduced by Sir Ronald Fisher in 
1926 as a time-efficient way to obtain an estimate of experimental error 
from the trial itself. Previously, in the nineteenth and early twentieth 
centuries, the same trial — for example, of evaluating the effect of ma- 
nure application — was repeated for several years in the so-called “uni- 
formity trials” before one could validate the effect of the treatment. 

The number of replications is a critical choice in designing an experiment, because replication is a way to reduce the experimental error. However, beyond a certain number of replications, the additional benefit no longer justifies the additional cost involved, as we have seen earlier discussing the relationship between the degrees of freedom and the size of t.


10.10. Randomization 


Randomization is critically important to allow the unbiased assignment of treatments to experimental units (in the case of variety trials, for example, the treatments are the varieties and the experimental units are the plots) and is a prerequisite for obtaining a valid estimate of the experimental error. One of the most common mistakes in the layout of trials, besides not using replications, is to avoid the randomization in the first replication so that the plot order corresponds to the treatment order, which is statistically incorrect. Another common mistake in METs is to use the same randomization in all the locations within the same year. This practice is also statistically incorrect. These three types of mistakes (no randomization, no randomization in the first replication, and the same randomization for all the locations) are unfortunately still very common.
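A minimal sketch of a correct randomization in base R is shown below. The entry names, the location names, the number of blocks and the seed are all hypothetical; the point is only that every block, including the first, and every location receives its own independent random order.

```r
set.seed(2024)                        # hypothetical seed, for reproducibility only
entries <- paste0("V", 1:5)           # hypothetical entries (for example varieties)
locations <- c("Loc1", "Loc2")        # hypothetical locations
n_blocks <- 4

# One independent random order per block and per location
layout <- lapply(locations, function(loc) {
  sapply(seq_len(n_blocks), function(blk) sample(entries))
})
names(layout) <- locations

# Each matrix has one column per block; rows are plot positions in the block
print(layout)
```

Each call to sample() draws a new permutation, so no block repeats the treatment order by construction and the two locations do not share the same layout except by chance.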


10.11. Software for randomization 


Randomization can be done with a variety of software, including tools available in R. One software that is particularly recommended is DiGGeR, which can be freely downloaded from: http://www.nswdpibiom.org/austatgen/software/.

DiGGeR generates optimal experimental designs such as incomplete block designs, row-column designs and spatial designs. DiGGeR was developed for cereal variety trials with plots in rectangular arrays. DiGGeR is available as a standalone executable and as an R package. The standalone version runs from an input file or interactively in a command window. The R package generates search specifications which can be modified before the search is run. Functions in R provide search methods for common design types.


10.12. Experimental designs 


An experimental design is a mechanism to make statistically valid 
comparisons and is guided by the level of variability within the experi- 
mental material and the size and shape of the experimental unit. 

The experimental unit in an agricultural experiment can be a pot in 
the greenhouse, a plot in the field, or a petri dish in a lab. It is the small- 
est unit of the experimental material. 

The experimental material could be the seed used to study germination or dormancy, or treatments such as plus or minus fertilizer in agronomy experiments, or accessions, or varieties or mixtures, or populations in the field to compare yield and other characters.

For the sake of brevity, we will use the generic term “entries” to indi- 
cate the experimental material. 

In an experimental design, every entry will be assigned to a different 
experimental unit. 


Table 11. Experimental designs useful in EPB trials. 


Material available | Experimental design
Several entries, little seed available per entry, such as initial evaluation of a germplasm collection prior to assembling an EP | Unreplicated with systematic checks, or partially replicated (p-rep) in rows and columns, or incomplete blocks in two replications in rows and columns
Fewer entries, more seed available per entry | Incomplete blocks in two replications in rows and columns, or RCBD with individual farms as replications, or unreplicated with large plots and sampling within plots


Table 11 shows some experimental designs that can be used for 
the evaluation of EPs, but also for other types of entries. The choice 
of the designs is dictated by a combination of factors such as 1) the 
amount of seed available, 2) the average farm size in the target area, 
3) the characters to be measured, and 4) the number of locations that 
we intend to use. 


10.12.1 Randomized Complete Block Design (RCBD) 


The Randomized Complete Block Design is by far the most popular 
and most frequently used experimental design. It consists of dividing 
the field in blocks (also called replications or replicates), usually 2 to 
4, and then randomly assigning the entries (for example varieties) to 
different positions (plots) within each block. 

The objective is to keep the variability within each block as small as possible. It is therefore advisable to avoid rectangular shapes with one very long side and to prefer a shape as close as possible to a square.

Blocks should be small (see below) and should all have the same shape. In the case of obvious gradients of soil fertility, soil type (as revealed by soil colour or presence of stones), and slope, the blocks should be laid down perpendicular to the gradient (figure 49).

This is because in the case of a perpendicular arrangement of the blocks (left side of figure 49), the effect of the gradient will be captured by the variance between blocks while, with the arrangement parallel to the gradient, the effect of the slope will affect the comparison between varieties within blocks, with those varieties towards the bottom of the slope being favoured because of the accumulation of water and nutrients. Additionally, soil at the bottom of a slope is usually deeper than at the top. This will be clearer when we illustrate how to analyse the data of a RCBD.


Figure 49. Arrangements of blocks in the case of a gradient (in this case a slope)


The major problem with the randomized block design in agricultural experiments is that the area corresponding to one block is supposed to be uniform. This assumption is usually valid for experiments conducted in the rather uniform conditions of a research station and/or for a small number of entries. With a large number of entries and/or for experiments conducted in farmers’ fields, the RCBD is seldom suitable.


10.12.1.1. Analysis of Variance (ANOVA) of a Randomized Complete 
Block Design 

The ANOVA of the data collected from an experiment designed as 
Randomized Complete Block Design (RCBD) will be shown using the 
data of table 12 obtained from a simple experiment with 5 varieties and 
4 blocks. 

The ANOVA reported in table 14 consists of partitioning the total variance, calculating first the sum of squares (SS) of the deviations from the grand mean 17.8, starting from the first block of variety 1 until the last block of variety 5:

$SS = \sum (x - \bar{x})^2 = (15 - 17.8)^2 + (18 - 17.8)^2 + \dots + (15 - 17.8)^2 + (16 - 17.8)^2 = 223.2$

The objective of the analysis is to partition the total SS of 223.2 between the possible causes of variation. One cause is the variety, because we have used 5 different varieties; these were arranged in 4 different blocks, which are the second cause of variation.


Table 12. Data from a Randomized Complete Block Design arranged 
for the ANOVA. 


Variety | Block 1 | Block 2 | Block 3 | Block 4 | Totals | Means
1 | 15 | 18 | 17 | 18 | 68 | 17.0
2 | 16 | 15 | 13 | 16 | 60 | 15.0
3 | 23 | 25 | 22 | 24 | 94 | 23.5
4 | 20 | 16 | 14 | 16 | 66 | 16.5
5 | 20 | 17 | 15 | 16 | 68 | 17.0
Totals | 94 | 91 | 81 | 90 | 356 |
Means | 18.8 | 18.2 | 16.2 | 18.0 | | 17.8


The SS due to these two causes can be calculated as follows:

The SS due to varieties is calculated as the sum of the squared deviations of the means of the varieties from the grand mean, multiplied by the number of blocks:

$SS_{varieties} = [(17 - 17.8)^2 + (15 - 17.8)^2 + (23.5 - 17.8)^2 + (16.5 - 17.8)^2 + (17 - 17.8)^2] \times 4 = 43.3 \times 4 = 173.2$

Similarly, the SS due to blocks is calculated as the sum of the squared deviations of the means of the blocks from the grand mean, multiplied by the number of varieties:

$SS_{blocks} = [(18.8 - 17.8)^2 + (18.2 - 17.8)^2 + (16.2 - 17.8)^2 + (18 - 17.8)^2] \times 5 = 3.76 \times 5 = 18.8$

The total of the last two SS, namely the SS due to varieties, 173.2, and the SS due to blocks, 18.8, is 192. Therefore, there is a SS of:

$SS_{total} - (SS_{varieties} + SS_{blocks}) = 223.2 - (173.2 + 18.8) = 31.2$, which is due neither to varieties nor to blocks.

This is the SS due to causes that we did not or could not control in the experiment, since we only controlled varieties and blocks. Such an uncontrolled SS is the experimental error, namely an estimate of random variability. The source of the error can be understood by examining, for example, how the varieties behave in blocks 1 and 2. Block 1 has a higher mean (18.8) than block 2 (18.2). However, while this can also be found in variety 2 (16 in block 1 and 15 in block 2), variety 4 (20 in block 1 and 16 in block 2) and variety 5 (20 in block 1 and 17 in block 2), both varieties 1 and 3 have larger values in block 2 than in block 1.
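The partition of the total SS can be retraced in R from the data of table 12. This is a minimal sketch; the matrix layout (varieties in rows, blocks in columns) and the object names are our own choices.

```r
# Data of table 12: rows are varieties 1-5, columns are blocks 1-4
obs <- matrix(c(15, 18, 17, 18,
                16, 15, 13, 16,
                23, 25, 22, 24,
                20, 16, 14, 16,
                20, 17, 15, 16),
              nrow = 5, byrow = TRUE)

grand_mean <- mean(obs)

# Total SS: squared deviations of every observation from the grand mean
ss_total <- sum((obs - grand_mean)^2)

# SS of varieties and of blocks from the squared deviations of their means
ss_varieties <- ncol(obs) * sum((rowMeans(obs) - grand_mean)^2)
ss_blocks <- nrow(obs) * sum((colMeans(obs) - grand_mean)^2)

# The error SS is what remains after removing varieties and blocks
ss_error <- ss_total - ss_varieties - ss_blocks

print(round(c(total = ss_total, varieties = ss_varieties,
              blocks = ss_blocks, error = ss_error), 1))
```

Obtaining the error by subtraction mirrors the hand calculation and makes clear that it collects everything not attributed to the two controlled factors.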


Table 13. Partial results of the ANOVA of the data of table 12. 

Source of variation (sv) | df | Sum of squares
Varieties | 4 | 173.2
Blocks | 3 | 18.8
Error | 12 | 31.2
Total | 19 | 223.2


We can now arrange the results as in table 13, keeping in mind that we have a total of 5 varieties x 4 blocks = 20 data with 20-1 = 19 degrees of freedom, 5 varieties with 5-1 = 4 degrees of freedom and 4 blocks with 4-1 = 3 degrees of freedom. The degrees of freedom for the error are obtained by difference (19-(4+3)) = 12, but in the case of a data set such as the one in table 12, can be obtained as the product of the degrees of freedom of the two factors controlled in the experiment (4 x 3 = 12).

The next step is to calculate the mean squares (or sample variances), 
by dividing the SS by the respective degrees of freedom (table 14). 


Table 14. Results of the ANOVA of the data of table 12. 


Source of variation (sv) | df | Sum of squares | Mean Square | F | P
Varieties | 4 | 173.2 | 43.3 | 16.7 | <0.05
Blocks | 3 | 18.8 | 6.27 | 2.4 | >0.05
Error | 12 | 31.2 | 2.60 | |
Total | 19 | 223.2 | | |


At this point we are ready to perform a test of significance. In this case, the null hypothesis is that the three estimates of variance, namely 43.3, 6.27, and 2.60, are independent random samples from the same normal populations with a mean $\mu$ estimated by the grand mean (17.8) and a variance $\sigma^2$.

As we saw earlier in the case of linear regression, the test to be used is the F-test = mean square 1/mean square 2, where mean square 1 is the larger mean square.

In our case, the two values of F are 43.3/2.60 = 16.7 for the varieties and 6.27/2.60 = 2.4 for the blocks.

As we did in the case of the t-test, these values must be compared 
with those expected in the case that the null hypothesis is true. 

The table of F (table 15) is used as follows: in the case of varieties,
we look at the column with 4 degrees of freedom and at the row with
12 degrees of freedom and we find a value of 3.259, much smaller than
16.7. Therefore, we can reject the null hypothesis, because an F value
this large has a probability lower than 5% if the null hypothesis is
true, and we can declare the differences between varieties significant
at P < 0.05.

It is possible to calculate the exact probability of a given F ratio
online by going to:
https://www.socscistatistics.com/pvalues/fdistribution.aspx




After entering the F value, the degrees of freedom of the numerator,
and the degrees of freedom of the denominator, we can choose the
probability level (0.01, 0.05, or 0.10). In our case, after pressing
“calculate”, we obtain a p-value of .000076, which is the probability of
finding by chance an F value of 16.7 or larger with the number of
degrees of freedom given.
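The same probability can be approximated without the online calculator. The sketch below is an illustration in Python and rests on one assumption that happens to hold for the varieties test: both degrees of freedom (4 and 12) are even, in which case the upper tail of the F distribution reduces to a finite binomial sum. For odd degrees of freedom (such as the 3 of the blocks) a statistical library would be needed instead. The function name is hypothetical.

```python
from math import comb


def f_upper_tail_even_df(f_value, df_num, df_den):
    """Upper-tail probability P(F >= f_value), valid only for even df."""
    if df_num % 2 or df_den % 2:
        raise ValueError("this closed form needs even degrees of freedom")
    a, b = df_den // 2, df_num // 2
    x = df_den / (df_den + df_num * f_value)
    n = a + b - 1
    # For integer a and b the regularized incomplete beta function I_x(a, b)
    # equals the probability of at least a successes in n binomial trials.
    return sum(comb(n, j) * x**j * (1 - x) ** (n - j) for j in range(a, n + 1))


p_value = f_upper_tail_even_df(16.7, df_num=4, df_den=12)
print(f"p = {p_value:.6f}")
```

As a cross-check, feeding the function the tabulated critical value 3.259 (4 and 12 degrees of freedom) returns a probability very close to 0.05.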

In the case of blocks, we look at the column with 3 degrees of
freedom and at the row with 12 degrees of freedom and we find a value
of 3.49, larger than 2.4: therefore, in this case, an F value of this size
has a probability higher than 5% under the null hypothesis and there is
no reason to reject it. In other words, the means of the 4 blocks are not
significantly different, probably because the soil was very uniform. The
two mean squares of 6.26 and 2.59 are therefore estimates of the same
σ² and could thus be combined. This is usually done by summing the
two SS (18.8 + 31.1 = 49.9) and dividing by the sum of the respective
degrees of freedom (3 + 12 = 15). We cannot combine the mean squares
directly because they do not have additive properties. The new, pooled
estimate of the error variance is thus 49.9/15 = 3.33. This operation is
always recommended, particularly when there is a small number of
degrees of freedom for the error. The general rule is that the higher the
number of degrees of freedom, the more accurate the estimate of the
error variance. However, as we noticed in the case of the t-test, there is
little justification for having more than about 30 degrees of freedom for
the error variance because beyond this value there is only a small
decrease in the critical value of F, unless both mean squares have few
degrees of freedom.
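The pooling step can be sketched as follows (illustrative Python, variable names our own). The second calculation is included only to show why the mean squares themselves are not simply averaged: the result differs from the pooled estimate because it ignores the different degrees of freedom.

```python
# Sums of squares and degrees of freedom of blocks and error (table 13).
ss_blocks, df_blocks = 18.8, 3
ss_error, df_error = 31.1, 12

# Pooling: add the sums of squares and the degrees of freedom, then divide.
pooled_ss = ss_blocks + ss_error
pooled_df = df_blocks + df_error
pooled_error_variance = pooled_ss / pooled_df

# Unweighted average of the two mean squares, shown for comparison only.
naive_average = (ss_blocks / df_blocks + ss_error / df_error) / 2

print(f"pooled df = {pooled_df}, pooled error variance = {pooled_error_variance:.2f}")
print(f"unweighted average of the mean squares = {naive_average:.2f}")
```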
Table 15. Values of F at 5% probability.


df2\df1      1      2      3      4      5      6      7      8      9     10     12     15     20     24     30     40     60    120    inf
1        161.4  199.5  215.7  224.6  230.2  234.0  236.8  238.9  240.5  241.9  243.9  245.9  248.0  249.1  250.1  251.1  252.2  253.3  254.3
2        18.51  19.00  19.16  19.25  19.30  19.33  19.35  19.37  19.38  19.40  19.41  19.43  19.45  19.45  19.46  19.47  19.48  19.49  19.50
3        10.13  9.552  9.277  9.117  9.014  8.941  8.887  8.845  8.812  8.786  8.745  8.703  8.660  8.639  8.617  8.594  8.572  8.549  8.526
4        7.709  6.944  6.591  6.388  6.256  6.163  6.094  6.041  5.999  5.964  5.912  5.858  5.803  5.774  5.746  5.717  5.688  5.658  5.628
5        6.608  5.786  5.409  5.192  5.050  4.950  4.876  4.818  4.772  4.735  4.678  4.619  4.558  4.527  4.496  4.464  4.431  4.398  4.365
6        5.987  5.143  4.757  4.534  4.387  4.284  4.207  4.147  4.099  4.060  4.000  3.938  3.874  3.841  3.808  3.774  3.740  3.705  3.669
7        5.591  4.737  4.347  4.120  3.972  3.866  3.787  3.726  3.677  3.637  3.575  3.511  3.445  3.410  3.376  3.340  3.304  3.267  3.230
8        5.318  4.459  4.066  3.838  3.688  3.581  3.500  3.438  3.388  3.347  3.284  3.218  3.150  3.115  3.079  3.043  3.005  2.967  2.928
9        5.117  4.256  3.863  3.633  3.482  3.374  3.293  3.230  3.179  3.137  3.073  3.006  2.936  2.900  2.864  2.826  2.787  2.748  2.707
10       4.965  4.103  3.708  3.478  3.326  3.217  3.135  3.072  3.020  2.978  2.913  2.845  2.774  2.737  2.700  2.661  2.621  2.580  2.538
11       4.844  3.982  3.587  3.357  3.204  3.095  3.012  2.948  2.896  2.854  2.788  2.719  2.646  2.609  2.570  2.531  2.490  2.448  2.404
12       4.747  3.885  3.490  3.259  3.106  2.996  2.913  2.849  2.796  2.753  2.687  2.617  2.544  2.505  2.466  2.426  2.384  2.341  2.296
13       4.667  3.806  3.411  3.179  3.025  2.915  2.832  2.767  2.714  2.671  2.604  2.533  2.459  2.420  2.380  2.339  2.297  2.252  2.206
14       4.600  3.739  3.344  3.112  2.958  2.848  2.764  2.699  2.646  2.602  2.534  2.463  2.388  2.349  2.308  2.266  2.223  2.178  2.131
15       4.543  3.682  3.287  3.056  2.901  2.790  2.707  2.641  2.588  2.544  2.475  2.403  2.328  2.288  2.247  2.204  2.160  2.114  2.066
16       4.494  3.634  3.239  3.007  2.852  2.741  2.657  2.591  2.538  2.494  2.425  2.352  2.276  2.235  2.194  2.151  2.106  2.059  2.010
17       4.451  3.592  3.197  2.965  2.810  2.699  2.614  2.548  2.494  2.450  2.381  2.308  2.230  2.190  2.148  2.104  2.058  2.011  1.960
18       4.414  3.555  3.160  2.928  2.773  2.661  2.577  2.510  2.456  2.412  2.342  2.269  2.191  2.150  2.107  2.063  2.017  1.968  1.917
19       4.381  3.522  3.127  2.895  2.740  2.628  2.544  2.477  2.423  2.378  2.308  2.234  2.156  2.114  2.071  2.026  1.980  1.930  1.878
20       4.351  3.493  3.098  2.866  2.711  2.599  2.514  2.447  2.393  2.348  2.278  2.203  2.124  2.082  2.039  1.994  1.946  1.896  1.843
21       4.325  3.467  3.072  2.840  2.685  2.573  2.488  2.420  2.366  2.321  2.250  2.176  2.096  2.054  2.010  1.965  1.916  1.866  1.812
22       4.301  3.443  3.049  2.817  2.661  2.549  2.464  2.397  2.342  2.297  2.226  2.151  2.071  2.028  1.984  1.938  1.889  1.838  1.783
23       4.279  3.422  3.028  2.796  2.640  2.528  2.442  2.375  2.320  2.275  2.204  2.128  2.048  2.005  1.961  1.914  1.865  1.813  1.757
24       4.260  3.403  3.009  2.776  2.621  2.508  2.423  2.355  2.300  2.255  2.183  2.108  2.027  1.984  1.939  1.892  1.842  1.790  1.733
25       4.242  3.385  2.991  2.759  2.603  2.490  2.405  2.337  2.282  2.236  2.165  2.089  2.007  1.964  1.919  1.872  1.822  1.768  1.711
26       4.225  3.369  2.975  2.743  2.587  2.474  2.388  2.321  2.265  2.220  2.148  2.072  1.990  1.946  1.901  1.853  1.803  1.749  1.691
27       4.210  3.354  2.960  2.728  2.572  2.459  2.373  2.305  2.250  2.204  2.132  2.056  1.974  1.930  1.884  1.836  1.785  1.731  1.672
28       4.196  3.340  2.947  2.714  2.558  2.445  2.359  2.291  2.236  2.190  2.118  2.041  1.959  1.915  1.869  1.820  1.769  1.714  1.654
29       4.183  3.328  2.934  2.701  2.545  2.432  2.346  2.278  2.223  2.177  2.104  2.027  1.945  1.901  1.854  1.806  1.754  1.698  1.638
30       4.171  3.316  2.922  2.690  2.534  2.421  2.334  2.266  2.211  2.165  2.092  2.015  1.932  1.887  1.841  1.792  1.740  1.683  1.622
40       4.085  3.232  2.839  2.606  2.449  2.336  2.249  2.180  2.124  2.077  2.003  1.924  1.839  1.793  1.744  1.693  1.637  1.577  1.509
60       4.001  3.150  2.758  2.525  2.368  2.254  2.167  2.097  2.040  1.993  1.917  1.836  1.748  1.700  1.649  1.594  1.534  1.467  1.389
120      3.920  3.072  2.680  2.447  2.290  2.175  2.087  2.016  1.959  1.910  1.834  1.751  1.659  1.608  1.554  1.495  1.429  1.352  1.254
inf      3.842  2.996  2.605  2.372  2.214  2.099  2.010  1.938  1.880  1.831  1.752  1.666  1.571  1.517  1.459  1.394  1.318  1.221  1.000


10.12.2. α-designs or incomplete block designs


Patterson and Williams (1976) introduced a type of incomplete block
design called the α-design.

As the name indicates, blocks in this design are incomplete because
each one of them contains only part of the entries. The first designs of
this type had the restriction that if you have v varieties with block
size k, v must be a multiple of k, i.e., v = k*s, where s is the number
of incomplete blocks of size k in each replication. For example, with 40
varieties, it is possible to have 10 incomplete blocks of size 4 (i.e., with
4 varieties in each incomplete block) or 8 incomplete blocks of size 5
(i.e., with 5 varieties in each incomplete block).
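The restriction v = k*s can be explored with a short sketch that lists every admissible combination of block size and number of incomplete blocks for a given number of varieties. This is illustrative Python with a hypothetical function name, not the procedure of any design package.

```python
def incomplete_block_options(n_varieties):
    """List (block size k, number of blocks s) pairs with k * s == n_varieties."""
    return [
        (k, n_varieties // k)
        for k in range(2, n_varieties)
        if n_varieties % k == 0
    ]


options = incomplete_block_options(40)
for block_size, n_blocks in options:
    print(f"{n_blocks} incomplete blocks of size {block_size}")
```

With 40 varieties the list includes the two cases mentioned above, 10 blocks of size 4 and 8 blocks of size 5, along with the other divisors of 40.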

The main advantage of incomplete block designs is that they assume
that the area of an incomplete block is uniform, rather than the much
bigger area of a block (replication) in the RCBD, and therefore
they are much more suitable than the RCBD to test a large number of
entries, or to conduct an experiment in conditions such as terraced
agricultural landscapes, where every terrace can become an incomplete
block.

With the incomplete block designs we expect a reduction of the error 
variance, which is generally higher when the number of varieties is 
large (Patterson and Hunter 1983). 

A comparison between an ANOVA of an incomplete block design 
and of a randomized block design (table 16) is useful to see the changes 
in the partitioning of the degrees of freedom of the error. 


Table 16. Comparison between the ANOVA for a RCBD and incom- 
plete block designs in the case of 18 entries and 2 replications. In the 
incomplete block design, we used incomplete blocks of size 3 (only 
degrees of freedom are shown). 


                                      ANOVA (df)
Source of variation (sv)       RCBD   Incomplete Block
Entries                         17         17
Replications                     1          1
Blocks within replications                 10
Error                           17          7
Total                           35         35


In the incomplete block design, 10 of the 17 degrees of freedom of the 
error in the RCBD are now associated with blocks within replications: 
with blocks of size 3 we have 18/3 = 6 incomplete blocks within each 
replication contributing 6-1 = 5 degrees of freedom x 2 replications = 
10 degrees of freedom. On the one hand this is a penalty, because the 
critical F will be higher, but on the other, this could be an advantage if 
there is a proportionately higher reduction in the error variance. 
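The partitioning in table 16 can be reproduced with a small sketch. The code is illustrative Python; the function name and its arguments are our own, and it assumes that the number of entries is a multiple of the block size.

```python
def anova_df(n_entries, n_reps, block_size=None):
    """Degrees of freedom for an RCBD (block_size=None) or an incomplete block design."""
    total = n_entries * n_reps - 1
    df = {"entries": n_entries - 1, "replications": n_reps - 1}
    if block_size is not None:
        blocks_per_rep = n_entries // block_size
        # (blocks per replication - 1) df in each replication.
        df["blocks within replications"] = (blocks_per_rep - 1) * n_reps
    # Whatever is left after the controlled sources goes to the error.
    df["error"] = total - sum(df.values())
    df["total"] = total
    return df


rcbd = anova_df(n_entries=18, n_reps=2)
incomplete = anova_df(n_entries=18, n_reps=2, block_size=3)
print("RCBD:", rcbd)
print("Incomplete block:", incomplete)
```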
10.12.3. Augmented design


When the total number of entries to be tested varies, for example,
from as few as 15 to as many as 160 at each location (and in some
cases can be as high as a few thousand), a compromise must be sought
between the plot size and the number of locations. This compromise
is reached by sacrificing replications in favour of locations, as is done
in the majority of modern breeding programs (Portmann and Ketata
1997). This recognizes that, in the early stages of a plant breeding
program, ranking of genotypes is more important than accurately
predicting their yields (Kempton and Gleeson 1997) and that the variance
due to GEI is larger than the experimental error variance. These trials
can be grown using from one to four checks repeated a sufficient number
of times to have at least 30 degrees of freedom for the estimate of the
error variance. This design is commonly known as the Augmented Design.

In an augmented design, the checks commonly used are established
varieties, because a) they represent what the breeding program aims
to replace and b) for such varieties seed is usually available in
sufficient quantity. The remaining entries are breeding material still
under selection. This is the weak point of the augmented design because
the entries which are used to estimate the error variance are different
from those which are under testing. If the entries used to estimate the
error variance interact with the surrounding physical environment
differently from the lines under testing, the error variance may be
overestimated or underestimated.

This problem is solved by using partially replicated designs. 


10.12.4. Partially replicated designs (p-rep) 


These types of designs were initially proposed by Cullis et al. (2006) 
for early generation variety trials as an improvement over the augment- 
ed design. It has been demonstrated that p-rep designs in rows and col- 
umns (see below) resulted in higher genetic gains. 

Furthermore, the p-rep designs respond well to a very frequent
problem in breeding trials comparing different varieties, namely the
fact that, almost invariably, the amount of seed available differs
between entries.

With the experimental designs presented so far, the size of the
experiments (plot size, number of replications, and number of locations)
is dictated by the entry with the smallest amount of seed. The solution
was often to drop that entry, which did not completely solve the problem
because when breeding trials are conducted with entries having different
amounts of seed, as is usually the case, there will always be an extra
amount of leftover seed to dispose of.

The problem is at least partially solved with the p-rep design because
we can use a variable number of replications: the entries with more seed
will be replicated more often than those with less seed, which can then
be part of the trial with just one replicate per location. The additional
advantage is that the error variance in a p-rep design is estimated using
many more, and more diverse, entries than (for example) in the
augmented design. However, it is still possible to include replicated
plots of well-known varieties, which in a participatory program are
useful for the farmers to observe and compare with new entries.


10.12.5. The spatial control of heterogeneity: the row and column 
design 


In experiments conducted in farmers’ fields, an additional improve- 
ment in precision can be obtained by controlling the variability in two 
directions, thus reducing the experimental error. 

There are various types of row and column designs that can be
generated either as RCBD or as p-rep using the R package DiGGeR
(Coombes 2009), described earlier. This program optimizes the
randomization, making sure that, particularly in the case of p-rep
designs, the replicated entries are uniformly distributed throughout
the experimental area. This allows for an unbiased estimate of the
experimental error.

A script for the randomization is given below. The script requires 
the library DiGGeR and provides as an example a trial with 118 entries 
(nt=118) arranged in 8 rows (nrd=8) and 24 columns (ncd=24) for a 
total of 192 plots (8 x 24). 

The third line of the script contains the specifications for the
randomization: the first parenthesis (1, 2, 4) indicates the number of
replications, while the second parenthesis indicates, in the same order,
the number of entries. In this example, 56 entries are replicated once,
56 entries are replicated twice, and 6 entries are replicated four times.
Note that 56 + 56 + 6 = 118 (total number of entries) and
(56 x 1) + (56 x 2) + (6 x 4) = 192 (total number of plots).


library(DiGGeR)
dp192 <- des.prep00(nt=118, nrd=8, ncd=24,
    trep=rep(c(1,2,4), c(56,56,6)),    # replications per entry
    tgrp=rep(c(1,1,2), c(56,56,6)),    # 1 = test lines, 2 = checks
    ribs=c(4,4,4),                     # row blocking for the optimization
    cibs=c(24,12,6))                   # column blocking for the optimization
dp192 <- run(dp192)
mp192 <- getDesign(dp192)
des.plot(mp192, seq(56), col=8, new=TRUE, label=TRUE, chtdiv=3)
des.plot(mp192, seq(56)+56, col=7, new=FALSE, label=TRUE, chtdiv=3)
des.plot(mp192, seq(6)+112, col=4, new=FALSE, label=TRUE, chtdiv=3,
    bdef=cbind(4,6), bwd=4)
des.tab(mp192, 8, 24)
View(dp192$dlist)
write.csv(dp192$dlist, file="prep118.csv", row.names=FALSE, quote=FALSE)


In the fourth line, the first parenthesis (1, 1, 2) indicates which entries
are the test lines (indicated by 1) and which entries are the checks
(indicated by 2). In this example, 112 entries are the test lines, but
those replicated once must be kept distinct from those replicated twice,
hence 56, 56, while the 6 entries replicated 4 times are the checks.

The following two lines set the conditions for optimizing the
randomization by dividing the rows only once into four groups of two
rows each (ribs=c(4,4,4)), while the 24 columns are divided first into
two groups of 12, each of which is then divided in half
(cibs=c(24,12,6)). Therefore, there will be a total of eight grids, and in
each grid there will be at least one check. With the traditional ways of
randomization, it is not possible to obtain such a uniform distribution
of the checks in the field.
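Before running the randomization it is worth checking that the replication classes add up to the field layout. The sketch below repeats the arithmetic of the example in Python; it is only a consistency check with variable names of our own, and it does not call or imitate DiGGeR.

```python
# Values taken from the example script: three classes of entries.
replications = (1, 2, 4)          # replications per entry in each class
entries_per_class = (56, 56, 6)   # number of entries in each class
group_codes = (1, 1, 2)           # 1 = test line, 2 = check
n_rows, n_cols = 8, 24

n_entries = sum(entries_per_class)
n_plots = sum(r * n for r, n in zip(replications, entries_per_class))
n_checks = sum(n for g, n in zip(group_codes, entries_per_class) if g == 2)
n_test_lines = n_entries - n_checks

# The plots required by the entries must fill the rows x columns layout exactly.
assert n_plots == n_rows * n_cols

print(f"entries = {n_entries}, plots = {n_plots}")
print(f"test lines = {n_test_lines}, checks = {n_checks}")
```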

The script above only runs with older versions of R, such as 2.15.2. 

Table 17 shows how the ANOVA, and the degrees of freedom of the
error variance, change with a row and column design: the uncontrolled
variability, which in the RCBD was entirely in the error term and in the
incomplete block design was partly removed by the incomplete blocks, is
captured in two directions in the row and column design.

As a result, the error term has slightly more degrees of freedom than
in the incomplete block design, but it is expected to greatly decrease in
size, thus increasing our ability to detect differences between entries.

Since the row and column designs were developed, there has been
increasing evidence of their superior efficiency, because these designs
can be analysed with spatial analysis (Singh et al. 2003).


Table 17. Comparison between the ANOVAs for an RCBD, an
incomplete block design, and a row and column design in the case of
18 entries and 2 replications. In the incomplete block design, we used
incomplete blocks of size 3 and in the row and column design we used
6 rows and 6 columns (only degrees of freedom are shown).


                                            ANOVA (df)
Source of variation (sv)       RCBD   Incomplete Block   Row and Column
Entries                         17         17                 17
Replications                     1          1
Rows                                                           5
Columns                                                        5
Blocks within replications                 10
Error                           17          7                  8
Total                           35         35                 35
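The row and column column of table 17 can be recomputed in the same way. This is an illustrative Python sketch with a hypothetical function name; following table 17, rows and columns are the only blocking terms fitted and replications do not appear as a separate source of variation.

```python
def row_column_df(n_entries, n_reps, n_rows, n_cols):
    """Degrees of freedom for a row and column design laid out as n_rows x n_cols plots."""
    if n_rows * n_cols != n_entries * n_reps:
        raise ValueError("rows x columns must equal entries x replications")
    total = n_rows * n_cols - 1
    df = {"entries": n_entries - 1, "rows": n_rows - 1, "columns": n_cols - 1}
    df["error"] = total - sum(df.values())
    df["total"] = total
    return df


row_column = row_column_df(n_entries=18, n_reps=2, n_rows=6, n_cols=6)
print(row_column)
```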


Spatial analysis can be performed in GenStat. Recently, a script to 
perform spatial analysis in R has been published (Velazco et al. 2017) 
and the full script can be found here: https://rdrr.io/cran/SpATS/ 


10.12.6. Unreplicated trials with large plots: One-way ANOVA 


One type of trial that is very useful for the evaluation of EPs when a
large amount of seed is available is the unreplicated design with
sampling within plots.

One example is a trial in which 4 EPs are tested for two years on 5
different farms (locations). On each farm, the 4 populations are each
planted as a single (i.e., without replications) 1,000 m² plot, with the
order of the 4 plots randomized independently on each farm. The
experiment is repeated for two years using, for each population and on
each farm, the seed harvested the previous year from the same population
on the same farm.

A number of different agronomic traits are measured in three random
squares of 2 m² within each plot. Therefore, the variance between
the random squares within the same plot, on the same farm, and in the
same year will be an estimate of the experimental error.

The ANOVA of this example is shown in table 18. 


Table 18. ANOVA for an unreplicated design with 4 populations, on
five farms for two years and three samples/plot/farm/year.


Source of variation (sv)        df
Years                            1
Populations                      3
Farms                            4
Populations x Years              3
Populations x Farms             12
Years x Farms                    4
Populations x Years x Farms     12
Error                           80
Total                          119


With 3 samples x 4 populations x 5 farms x 2 years, we have a total 
of 120 observations, hence the 119 degrees of freedom for the total 
variance. The degrees of freedom for years, populations, and farms are 
straightforward, namely 1, 3, and 4, respectively. Those for the inter- 
actions are obtained by multiplying the degrees of freedom of the re- 
spective sources of variation. Incidentally, in the evaluation of EPs, the 
interactions are of particular interest because they reveal the changes in 
time and space, which are expected to occur as a consequence of diver- 
gent evolution due to natural selection. 

The degrees of freedom of the error can be calculated by difference: 
119 - (1+3+4+3+12+4+12) = 80. However, they can also be calculated 
directly considering that the 3 samples within each large plot generate 
2 degrees of freedom. Therefore, 2 x 2 Years x 5 Farms x 4 populations 
= 80. 
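The whole of table 18 can be generated from the number of levels of each factor, which makes it easy to see how the degrees of freedom change with a different number of farms, years, or samples. The sketch is illustrative Python; the dictionary keys are our own labels.

```python
from itertools import combinations
from math import prod

levels = {"years": 2, "populations": 4, "farms": 5}
samples_per_plot = 3

df = {}
# Main effects and interactions: product of (levels - 1) of the factors involved.
for size in range(1, len(levels) + 1):
    for factors in combinations(levels, size):
        df[" x ".join(factors)] = prod(levels[f] - 1 for f in factors)

# Each large plot contributes (samples - 1) df to the error.
n_plots = prod(levels.values())
df["error"] = n_plots * (samples_per_plot - 1)
df["total"] = n_plots * samples_per_plot - 1

# The sources of variation must add up to the total.
assert df["total"] == sum(v for k, v in df.items() if k != "total")

for source, value in df.items():
    print(f"{source}: {value}")
```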

The interest of this type of trial is that, because of their size (which
can be modified according to the country, the crop, and the farm size),
the plots can be planted by the farmers and managed as their normal
crop. Furthermore, these trials generate enough seed for quality and/or
nutritional analysis and, although relatively simple, they generate a
sufficient number of degrees of freedom for the error to be considered
very reliable. Advantages and disadvantages of this type of trial, also
known as strip-trials, have been discussed by Yan et al. (2002).


10.13. Multi-Environment Trials (MET) 


The concept of Multi-Environment Trials (MET) was introduced on 
page 33 in our discussion of GEI. 

Conducting trials in different locations and years is essential to obtain 
information on how varieties, populations, and mixtures, which in this 
context we will define as genotypes, perform in response to various en- 
vironmental conditions, and to study the nature of GEI as we have seen 
from the example above. 

MET are essential in assessing EPs and mixtures over time and 
space since one of the expected advantages of this material is its su- 
perior stability over time and its ability to evolve specific adaptation 
in space. 

MET are normally designed as replicated trials, e.g., RCBD or
α-design, and are conducted over many locations and years to obtain
information on the genotype responses to the environments. For a
moderately large number of locations and years, two replications per
trial have been found to be optimal (Kempthorne 1983). In order to
properly analyze MET, it is important to remember that the
randomization MUST be different in each location and in each year,
which with current computational facilities does not represent a burden.

If we use a RCBD in a MET, in addition to entries and replications 
we have two additional factors: years and locations. It is important to 
have an efficient way to collect and store raw data to be able to run a 
combined analysis over genotypes, replications, years, and locations. 

Several methods of analysis can be found in the literature and in
review papers (Singh and El-Shama’a 2015) and one of the most common
is the GGE biplot that we introduced briefly on page 66. The GGE
biplot package (Yan et al. 2000) is also available in R (R Development
Core Team 2015).

The concept of GGE originates from the analysis of MET. The yield
(or any other measure, such as plant height, 1,000 kernel weight, etc.)
of a genotype in a given environment (location, year) is a combined
effect of the genotype (G), of the environment (E), and of their
interaction (GEI). In a MET, E normally accounts for about 80% of the
total variation, and G and GE each account for about 10% (Gauch and
Zobel 1996; Yan et al. 2000). For the purpose of genotype evaluation,
however, only G and GE are relevant (Gauch and Zobel 1996), thus the
term GGE (Yan et al. 2000).

After the ANOVA, the means of genotypes and environments (i.e.,
combinations of locations and years) are arranged in a two-way table
(usually genotypes as rows and environments as columns) as in table 19,
to graphically display the interrelationship among genotypes and
environments (Yan and Kang 2003). Note that when using the GGE biplot
package in R, the cell containing the header of the genotypes column
must be kept blank.

The analysis is actually done on data like those in table 19 after data
standardization. Standardization (done automatically by the software)
consists of subtracting from each value in a column the mean of that
column and dividing by the standard deviation of the same column. This
transforms the data of each column into deviations from the column
mean, so that each column has a mean of 0 and a standard deviation
of 1.

As a consequence, the effect of E disappears (all the columns, which 
represent the environments, have a mean of zero) and therefore the bi- 
plot only shows G + GE effects as deviations from 0. Hence the name 
GGE biplot. 
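The standardization can be illustrated on a small piece of table 19: the first three genotypes in the first three environments. The sketch is illustrative Python; it assumes the sample standard deviation (divisor n - 1), which may differ from what a given biplot package uses internally.

```python
from statistics import mean, stdev

# Rows = genotypes 1 to 3, columns = first three environments of table 19.
yields = [
    [3488, 3286, 1759],
    [3878, 3552, 1859],
    [3857, 3310, 1709],
]

# Work column by column (one column = one environment).
columns = list(zip(*yields))
standardized_columns = [
    [(value - mean(column)) / stdev(column) for value in column]
    for column in columns
]

# Back to the genotype x environment layout.
standardized = [list(row) for row in zip(*standardized_columns)]
for row in standardized:
    print([round(value, 2) for value in row])
```

After the transformation every column has a mean of 0, so differences between environments in their overall yield level no longer appear in the data.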

The biplot appears as shown in figure 50, in which the genotypes 
(12 in the case of table 19) are shown by green numbers, and the 
environments (9 in the case of table 19) in blue colour. The accurate 
position of a genotype and an environment is at the beginning of the 
respective label. 




Table 19. Two-way table (genotypes as rows and environments, i.e.,
year-location combinations, as columns) of 12 entries evaluated for 3
years in 3 locations each year.

      Y1L1  Y1L2  Y1L3  Y2L1  Y2L2  Y2L3  Y3L1  Y3L2  Y3L3
 1    3488  3286  1759  2923  3328  4203  2708  2095  2446
 2    3878  3552  1859  2909  2916  4137  3034  2222  2434
 3    3857  3310  1709  3066  3053  4250  2933  2608  2362
 4    3729  3431  1587  2944  3083  4272  2970  2048  2536
 5    3795  3329  1769  3035  2860  4041  2961  2313  2559
 6    3694  3343  2116  2980  3004  4228  2714  2188  2656
 7    3752  3239  1950  2946  3278  4213  2763  2045  2537
 8    3802  3074  1640  3062  3107  4271  2920  2764  2511
 9    3614  3147  1893  2941  3325  4202  2769  2145  2548
10    3901  3465  1779  3059  2856  4278  2928  2359  2647
11    3735  3361  1608  2993  3296  4142  3103  2584  2566
12    2593  3533  1905  3087  3142  4179  3214  1973  2626


[Figure 50: GGE biplot; axes PC1 = 26.93% and PC2 = 25.00%]

Figure 50. The biplot of 12 genotypes (the green numbers) evaluated in
9 year-location combinations (in blue).

In a GGE biplot, the genotypic PC1 scores are proportional to the
expression of the traits, while the PC2 scores are proportional to
deviations associated with GEI.

The sum of the percentages associated with PC1 and PC2 is an
indication of how much of the total variation in the original data set
is captured by the biplot. The higher this percentage (at least 60% as
the total of the two axes), the more confident the interpretation based
on the biplot will be. If only a small portion of the variation is
explained, the pattern in the data is either complicated (perhaps part of
the variation is captured by additional axes) or there is no discernible
pattern at all. When the data are sufficiently approximated by the
biplot, the cosine of the angle between the vectors of two environments
approximates the correlation coefficient between them.

In particular: 

1. The origin is the point with coordinates PC1 = 0 and PC2 = 0, as a
consequence of the standardization;

2. Two environments are positively correlated if the angle between
their vectors is <90° (example Y2L1 and Y3L1 in figure 50);

3. Two environments are negatively correlated if the angle between
their vectors is >90° (example Y2L1 and Y2L2 in figure 50);

4. Two environments are independent (correlation coefficient = 0) if
the angle between their vectors is near 90° (example Y2L1 and Y1L1
in figure 50);

5. Similar genotypes are positioned closely; genotypes that are
similar in GGE value have an acute angle (the angle formed between
the first genotype, the origin, and the second genotype), such as
genotypes 7, 9, and 1 in figure 50;

6. Dissimilar genotypes have a large/obtuse angle between 90° and
270°, such as genotypes 9 and 11 in figure 50;

7. Genotypes far from the origin (example genotypes 12 and 8 in
figure 50) have a large G plus GE effect. If a given genotype and a
given environment vector are on the same side of the origin (example
genotype 12 and Y1L2 or genotypes 1, 7, and 9 and Y2L2), that
genotype or those genotypes perform(s) above average in that
environment. By contrast, a genotype on the opposite side of the
origin from an environment vector (examples genotype 12 and Y1L1 or
genotype 3 and Y1L3) performs below average in that environment.
Genotypes close to the origin (example genotype 4) have average
performance in all environments (Yan et al. 2000);

8. Environments with longer vectors (such as Y3L1 and Y3L2) are
more discriminating of the genotypes; those with short vectors
(such as Y3L3) are less discriminating; those located at the biplot
origin are not discriminating (Yan and Rajcan 2002).
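The angle rules for environments (points 2 to 4) can be written as a short sketch. The code is illustrative Python; the coordinates are hypothetical and were not read from figure 50, and the 10° tolerance used for "near 90°" is an arbitrary choice of ours.

```python
from math import acos, degrees, hypot


def angle_between(u, v):
    """Angle in degrees between two vectors that start at the biplot origin."""
    cosine = (u[0] * v[0] + u[1] * v[1]) / (hypot(*u) * hypot(*v))
    # Clamp to [-1, 1] to protect acos from rounding errors.
    return degrees(acos(max(-1.0, min(1.0, cosine))))


def relationship(angle, tolerance=10.0):
    """Translate an angle into the interpretation used for environments."""
    if abs(angle - 90.0) <= tolerance:
        return "approximately independent"
    return "positively correlated" if angle < 90.0 else "negatively correlated"


# Hypothetical (PC1, PC2) coordinates of four environment markers.
env_a, env_b, env_c, env_d = (1.0, 0.2), (0.8, 0.6), (-0.9, 0.3), (0.2, -1.0)

for name, other in (("b", env_b), ("c", env_c), ("d", env_d)):
    angle = angle_between(env_a, other)
    print(f"a vs {name}: {angle:.1f} degrees, {relationship(angle)}")
```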

A very useful function is the “Which-Won-Where/What” (figure 51),
which is obtained as follows:

1. A polygon is drawn connecting those genotypes (in the example,
these are entries 6, 7, 9, 1, 8, 3, 10, and 12), called vertex
genotypes, which are located furthest away from the biplot origin such
that all other genotypes are contained within the polygon;

2. Lines starting at the origin and perpendicular to each side of the
polygon are drawn. These lines form sectors (the area comprised
between two lines).

In the case of figure 51, the perpendicular lines divide the biplot into
8 sectors with each environment falling into one of the sectors. The
vertex genotype for each sector has the largest value among all
entries in the environments falling within that sector (in other words,
it is the winner in those environments). In this example, genotype 12
wins in Y3L3, Y1L2, Y2L1, and Y3L1, genotype 8 wins in Y1L1, Y3L2, and
Y2L3, genotype 6 wins in Y1L3, and genotypes 9 and 1 win in Y2L2.
Therefore, the 9 environments are divided into four groups based on
the winners.
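The polygon of step 1 is the convex hull of the genotype markers, so the vertex genotypes can be found with any standard convex hull routine. The sketch below uses Andrew's monotone chain algorithm in Python; this is a generic technique chosen for illustration, not the code of the GGE biplot package, and the genotype labels and coordinates are hypothetical.

```python
def cross(o, a, b):
    """Cross product of vectors o->a and o->b (positive for a left turn)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def vertex_genotypes(scores):
    """Return the labels of the genotypes on the convex hull, counter-clockwise."""
    points = sorted((xy, label) for label, xy in scores.items())

    def half_hull(sequence):
        hull = []
        for xy, label in sequence:
            # Drop the last point while it does not make a left turn.
            while len(hull) >= 2 and cross(hull[-2][0], hull[-1][0], xy) <= 0:
                hull.pop()
            hull.append((xy, label))
        return hull[:-1]  # the last point starts the other half

    lower = half_hull(points)
    upper = half_hull(reversed(points))
    return [label for _, label in lower + upper]


# Hypothetical (PC1, PC2) scores of six genotypes.
scores = {
    "G1": (1.0, 1.0), "G2": (-1.0, 1.0), "G3": (-1.0, -1.0),
    "G4": (1.0, -1.0), "G5": (0.0, 0.0), "G6": (0.2, -0.3),
}
vertices = vertex_genotypes(scores)
print(vertices)
```

Genotypes that lie inside the polygon, such as the hypothetical G5 and G6, are never winners in any sector.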
[Figure 51: “Which Won Where/What” view of the GGE biplot; axes
PC1 = 26.93% and PC2 = 25.00%]

Figure 51. The “Which-Won-Where/What” feature of the GGE biplot, which
allows identifying the best genotypes for specific environments
(year-location combinations).


The average performance and stability of the genotypes can be vis- 
ualized using the “Means vs. Stability” feature (figure 52). The most 
important aspects of figure 52 are: 


1. A small circle (in figure 52 it is close to genotype 2) indicates the 
position of the average environment, which is defined by the av- 
erage PC1 and PC2 scores across all environments. This average 
environment can be regarded as a virtual environment; 

2. The green line that passes through the biplot origin and the average 
environment; this is referred to as the mean environment axis; 

3. A green arrow pointing to the average environment from the biplot 
origin; 

4. A red line that passes through the biplot origin and is perpendicular
to the mean environment axis;

5. A set of broken lines perpendicular to the mean environment axis, 
which start from the marker of each genotype. 
[Figure 52: “Means vs Stability” view of the GGE biplot; axes
PC1 = 26.93% and PC2 = 25.00%]

Figure 52. The “Means vs Stability” feature of the GGE biplot, which
allows classifying the genotypes based on the combination of their means
and their stability.


The projections of the genotypes tested in the experiment on the mean
environment axis approximate their mean yields (or the mean of the trait
being plotted), while the projection (the broken lines) on the
perpendicular axis approximates the GEI associated with the genotype.
The longer the projection, the greater the GEI, which is a measure of
instability. In this graphical representation, the ideal genotype, in
terms of grain yield and stability, is one that has the longest positive
projection on the mean environment axis (high mean) and a zero
projection on the perpendicular axis (high stability), such as genotypes
5 and 10. Genotype 12 has a better projection on the mean environment
axis than genotypes 5 and 10 but is very unstable, as shown by its large
projection on the perpendicular axis. Thus, in the resulting graph, the
genotypes are evaluated for a combination of high mean and high
stability in the sense of low GEI.
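The two projections can be computed directly from the biplot coordinates. The sketch below is illustrative Python; the function name and all coordinates are hypothetical and do not come from figure 52.

```python
from math import hypot


def mean_and_stability(genotype, average_environment):
    """Project a genotype marker on the mean environment axis and on its perpendicular."""
    ax, ay = average_environment
    length = hypot(ax, ay)
    ux, uy = ax / length, ay / length   # unit vector along the mean environment axis
    gx, gy = genotype
    along = gx * ux + gy * uy           # approximates the genotype mean
    across = -gx * uy + gy * ux         # signed deviation from the axis
    return along, abs(across)           # the size of the deviation approximates GEI


# Hypothetical (PC1, PC2) coordinates.
average_environment = (0.6, 0.8)
genotypes = {"A": (1.2, 1.6), "B": (2.0, 0.0), "C": (-0.3, -0.4)}

results = {
    name: mean_and_stability(xy, average_environment)
    for name, xy in genotypes.items()
}
for name, (along, across) in results.items():
    print(f"{name}: projection on mean axis = {along:.2f}, deviation = {across:.2f}")
```

In this made-up example genotype A combines a long positive projection with no deviation, genotype B has a shorter projection and a large deviation, and genotype C is stable but below average.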

Therefore, the three features of the GGE biplot allow the most
important information for a breeder to be extracted from the data of a
table such as table 19.


11. CONCLUSIONS


Evidence from ecological theory, genetic research, and agronomic 
practice shows that evolutionary populations and dynamic mixtures 
represent a dynamic response to the complexity of climate change not 
only in its physical characteristics (temperature and rainfall) but also 
in its biotic aspects and in its location specificity. Evolutionary popula- 
tions and dynamic mixtures, with their capacity to evolve in response 
to both biotic and abiotic stresses, as long as they maintain sufficient 
genetic diversity, appear to be the quickest, most cost-effective, and 
evolving solution to such a complex and evolving problem. They have 
the additional advantage of increasing yield gains resulting from a com- 
bination of natural and artificial selection and genetic recombination. 

They are also able to control pests, which makes them particularly suit- 
ed to organic systems, representing an ecological solution to pest control; 
they do not create a selection pressure favouring the evolution of resist- 
ance as is the case with Genetically Modified Organisms (GMOs) and 
the products of the new breeding techniques (Nbt). Therefore, EPs and 
mixtures will fill an important gap represented by the scarce availability 
of varieties specifically adapted to organic and low-input conditions. 

Because of their evolutionary capability and their ability to control
pests, they represent at the same time a mitigation and an adaptation
strategy to cope with climate change. They represent a mitigation
strategy because they considerably reduce the use of chemical inputs,
and they represent an adaptation strategy because of their ability to
continuously evolve to adapt to new combinations of biotic and abiotic
stresses.

Finally, as they evolve, they generate a continuous flow of novel,
cultivated agrobiodiversity even within the same crop, which will be
beneficial in increasing diet diversity and ultimately human health.

Therefore, evolutionary plant breeding is capable of reversing the
trend of modern plant breeding towards uniformity and of reconciling
plant breeding with the overwhelming scientific evidence of the
importance of biodiversity.